long-form-generation-llm

Factuality

FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation
Language Models Hallucinate, but May Excel at Fact Verification
RAGAS: Automated Evaluation of Retrieval Augmented Generation (sentence-level generation)
FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios
Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers
Towards LLM-based Fact Verification on News Claims with a Hierarchical Step-by-Step Prompting Method
Fine-tuning Language Models for Factuality
Chain-of-Verification Reduces Hallucination in Large Language Models
SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
RARR: Researching and Revising What Language Models Say, Using Language Models
FELM: Benchmarking Factuality Evaluation of Large Language Models
Improving Model Factuality with Fine-grained Critique-based Evaluator
Molecular Facts: Desiderata for Decontextualization in LLM Fact Verification
RAGBench: Explainable Benchmark for Retrieval-Augmented Generation Systems
FactBench: A Dynamic Benchmark for In-the-Wild Language Model Factuality Evaluation
FactAlign: Long-form Factuality Alignment of Large Language Models
Counterfactual Generation from Language Models
LongReward: Improving Long-context Large Language Models with AI Feedback
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents

Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences
LONG2RAG: Evaluating Long-Context & Long-Form Retrieval-Augmented Generation with Key Point Recall (based on ELI5)
L-Eval: Instituting Standardized Evaluation for Long Context Language Models
MultiDoc2Dial: Modeling Dialogues Grounded in Multiple Documents
AQuaMuSe: Automatically Generating Datasets for Query-Based Multi-Document Summarization
ExpertQA: Expert-Curated Questions and Attributed Answers
