📚 Michele's Notes

    The (Un)Suitability of Automatic Evaluation Metrics for Text Simplification

    May 06, 2025

    • year : 2021
    • authors : Fernando Alva-Manchego, Carolina Scarton, Lucia Specia
    • repository :
    • proceedings :
    • journal : Computational Linguistics
    • volume : 47
    • issue : 4
    • publisher :
    • doi : 10.1162/coli_a_00418
    • Abstract : In order to simplify sentences, several rewriting operations can be performed, such as replacing complex words per simpler synonyms, deleting unnecessary information, and splitting long sentences. Despite this multi-operation nature, evaluation of automatic simplification systems relies on metrics that moderately correlate with human judgments on the simplicity achieved by executing specific operations (e.g., simplicity gain based on lexical replacements). In this article, we investigate how well existing metrics can assess sentence-level simplifications where multiple operations may have been applied and which, therefore, require more general simplicity judgments. For that, we first collect a new and more reliable data set for evaluating the correlation of metrics and human judgments of overall simplicity. Second, we conduct the first meta-evaluation of automatic metrics in Text Simplification, using our new data set (and other existing data) to analyze the variation of the correlation between metrics' scores and human judgments across three dimensions: the perceived simplicity level, the system type, and the set of references used for computation. We show that these three aspects affect the correlations and, in particular, highlight the limitations of commonly used operation-specific metrics. Finally, based on our findings, we propose a set of recommendations for automatic evaluation of multi-operation simplifications, suggesting which metrics to compute and how to interpret their scores.
    • research
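
The abstract describes a meta-evaluation: checking how well automatic metric scores correlate with human judgments of overall simplicity. As a minimal sketch of that idea (not the authors' code), the snippet below computes a Pearson correlation in plain Python. The function name `pearson` and all numbers are hypothetical illustrations, not data from the paper, and the paper's actual statistical procedure may differ.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, left unnormalized because the 1/n factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical example: one automatic metric score and one human simplicity
# rating per system output (invented values, for illustration only).
metric_scores = [0.31, 0.42, 0.35, 0.50, 0.28]
human_ratings = [55.0, 70.0, 60.0, 82.0, 48.0]

r = pearson(metric_scores, human_ratings)
print(f"Pearson r between metric and human ratings: {r:.3f}")
```

Repeating this computation on subsets of the outputs (for example, grouped by system type or by perceived simplicity level) is one simple way to see how a correlation varies along the dimensions the abstract mentions.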

    Alva-Manchego et al_2021_The (Un)Suitability of Automatic Evaluation Metrics for Text Simplification.pdf

    Notes


    Backlinks

    • Automatic Evaluation of Simplicity
    • Human Assessment for Text Simplification
