LaTeX Reference Checker

Bash Script to Find Unused Labels and Missing Citations

Author

Dr. Md Abdus Samad

Published

January 3, 2026

Overview

Note: What This Tool Does

When preparing research manuscripts, it’s common to accumulate unused labels, orphaned citations, and commented-out content. This bash script audits a single LaTeX file and reports:

  • Unreferenced figures, tables, and equations
  • Unused bibliography entries
  • Missing citations (cited but not in the .bib file)

Commented-out lines and inline comments are ignored in every check.

Key Feature: The script preserves the sequence in which labels appear in your document, making it easier to locate and manage them.
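
The order-preserving behavior comes from the awk '!seen[$0]++' idiom that the script applies to every list of labels. A minimal sketch of that idiom next to sort -u, which also removes duplicates but reorders the lines; the function name dedupe_keep_order and the sample labels are hypothetical and used only for illustration:

```bash
# dedupe_keep_order: hypothetical helper name, not part of the script.
# Prints each input line the first time it appears, keeping input order.
dedupe_keep_order() {
    awk '!seen[$0]++'
}

# Labels as they might appear in a document, with one duplicate.
printf '%s\n' results setup results ablation | dedupe_keep_order
# Order of first appearance is kept: results, setup, ablation

# For contrast: sort -u also drops the duplicate but sorts alphabetically.
printf '%s\n' results setup results ablation | sort -u
# Alphabetical order: ablation, results, setup
```

Keeping document order means the report can be read side by side with the manuscript from top to bottom.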


The Problem

During manuscript development, you might encounter:

Issue | Example | Impact
Unreferenced labels | \label{fig:old_analysis} never used | Clutters document, confuses reviewers
Commented labels counted | % \label{tab:removed} still detected | False positives in checks
Missing citations | \cite{smith2023} but not in .bib | Undefined-citation warnings and "?" placeholders in the output
Unused bibliography | References added but never cited | Inflates reference count

Manual checking across 50+ pages with multiple revisions becomes impractical.


The Solution

Features

Feature | Description
Comment-aware | Ignores both full-line and inline comments
Sequence-preserving | Shows labels in document order
Comprehensive | Checks figures, tables, equations, and citations
Multiple reference styles | Supports \ref, \autoref, and \eqref
Lightweight | Bash plus grep, sed, and awk; requires GNU grep for the -P option
Fast | Processes typical manuscripts in <1 second
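
To make the comment-aware behavior concrete, here is a minimal sketch of comment stripping as a single sed pass. It is a variant of the grep and sed pipeline in the script, not a copy of it, and the function name strip_tex_comments is hypothetical. It assumes a sed that supports -E (GNU and BSD sed both do):

```bash
# strip_tex_comments: hypothetical helper. Reads LaTeX on stdin and writes
# it to stdout with comments removed. An escaped percent sign (\%) is kept.
strip_tex_comments() {
    # (^|[^\\]) matches the start of the line or any character other than
    # a backslash; \1 puts that character back after the comment is cut.
    sed -E 's/(^|[^\\])%.*$/\1/'
}

strip_tex_comments <<'EOF'
\label{fig:keep} % trailing note
% \label{tab:removed}
Accuracy rose by 5\% overall
EOF
```

The first line keeps its label and loses the note, the second becomes an empty line, and the third is unchanged. Two known gaps in this sketch: a percent sign directly after a double backslash (\\%) is treated as escaped, and verbatim environments are not recognized.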

Installation and Usage

Tip: Quick Start
  1. Save the script - Copy the code below and save as check_latex_refs.sh
  2. Make it executable - Run chmod +x check_latex_refs.sh
  3. Run the checker - Execute ./check_latex_refs.sh manuscript.tex

Step 1: Create the Script

VS Code: Open the integrated terminal with Ctrl+` (backtick).

Copy the following code and save it as check_latex_refs.sh:

#!/bin/bash

if [ $# -eq 0 ]; then
    echo "Usage: $0 <latex_file.tex>"
    exit 1
fi

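TEXFILE="$1"

# Stop early with a clear message instead of printing empty sections
if [ ! -f "$TEXFILE" ]; then
    echo "File not found: $TEXFILE" >&2
    exit 1
fi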

# Remove comments in two steps:
# Step 1: Remove lines that start with % (including whitespace before %)
# Step 2: Remove inline comments (everything after % that's not \%)
TEXCONTENT=$(grep -v '^\s*%' "$TEXFILE" | sed 's/\([^\\]\)%.*$/\1/')

echo "Checking references in: $TEXFILE"
echo "========================================"

# Tables (preserving order, excluding comments)
echo -e "\n=== TABLES ==="
TAB_LABELS=$(echo "$TEXCONTENT" | grep -oP '\\label\{tab:\K[^}]+')
TAB_LABELS_UNIQ=$(echo "$TAB_LABELS" | awk '!seen[$0]++')
# Catches both \ref and \autoref
TAB_REFS=$(echo "$TEXCONTENT" | grep -oP '\\(auto)?ref\{tab:\K[^}]+' | awk '!seen[$0]++')
TAB_COUNT=$(echo "$TAB_LABELS_UNIQ" | grep -v '^$' | wc -l)
REF_COUNT=$(echo "$TAB_REFS" | grep -v '^$' | wc -l)

echo "Total labels: $TAB_COUNT"
echo "Total refs: $REF_COUNT"
echo "Unreferenced tables:"

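while IFS= read -r label; do
    # Skip the empty line that the here-string yields when no labels exist
    [ -z "$label" ] && continue
    # -F compares the label as a literal string rather than as a regex
    if ! echo "$TAB_REFS" | grep -qxF -- "$label"; then
        echo "  tab:$label"
    fi
done <<< "$TAB_LABELS_UNIQ"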

# Figures (preserving order, excluding comments)
echo -e "\n=== FIGURES ==="
FIG_LABELS=$(echo "$TEXCONTENT" | grep -oP '\\label\{fig:\K[^}]+')
FIG_LABELS_UNIQ=$(echo "$FIG_LABELS" | awk '!seen[$0]++')
# Catches both \ref and \autoref
FIG_REFS=$(echo "$TEXCONTENT" | grep -oP '\\(auto)?ref\{fig:\K[^}]+' | awk '!seen[$0]++')
FIG_COUNT=$(echo "$FIG_LABELS_UNIQ" | grep -v '^$' | wc -l)
FIGREF_COUNT=$(echo "$FIG_REFS" | grep -v '^$' | wc -l)

echo "Total labels: $FIG_COUNT"
echo "Total refs: $FIGREF_COUNT"
echo "Unreferenced figures:"

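while IFS= read -r label; do
    # Skip the empty line that the here-string yields when no labels exist
    [ -z "$label" ] && continue
    # -F compares the label as a literal string rather than as a regex
    if ! echo "$FIG_REFS" | grep -qxF -- "$label"; then
        echo "  fig:$label"
    fi
done <<< "$FIG_LABELS_UNIQ"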

# Equations (preserving order, excluding comments)
echo -e "\n=== EQUATIONS ==="
EQ_LABELS=$(echo "$TEXCONTENT" | grep -oP '\\label\{eq:\K[^}]+')
EQ_LABELS_UNIQ=$(echo "$EQ_LABELS" | awk '!seen[$0]++')
# Catches \ref, \autoref, and \eqref
EQ_REFS=$(echo "$TEXCONTENT" | grep -oP '\\(auto|eq)?ref\{eq:\K[^}]+' | awk '!seen[$0]++')
EQ_COUNT=$(echo "$EQ_LABELS_UNIQ" | grep -v '^$' | wc -l)
EQREF_COUNT=$(echo "$EQ_REFS" | grep -v '^$' | wc -l)

echo "Total labels: $EQ_COUNT"
echo "Total refs: $EQREF_COUNT"
echo "Unreferenced equations:"

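while IFS= read -r label; do
    # Skip the empty line that the here-string yields when no labels exist
    [ -z "$label" ] && continue
    # -F compares the label as a literal string rather than as a regex
    if ! echo "$EQ_REFS" | grep -qxF -- "$label"; then
        echo "  eq:$label"
    fi
done <<< "$EQ_LABELS_UNIQ"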

# Citations (excluding comments)
echo -e "\n=== CITATIONS ==="
BIBFILE=$(echo "$TEXCONTENT" | grep -oP '\\bibliography\{\K[^}]+' | head -1)

if [ -n "$BIBFILE" ]; then
    [[ "$BIBFILE" != *.bib ]] && BIBFILE="${BIBFILE}.bib"

    if [ -f "$BIBFILE" ]; then
        echo "Using bibliography file: $BIBFILE"

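        # Entry keys only; @comment, @string, and @preamble blocks are skipped
        BIB_ENTRIES=$(grep -oiP '@(?!comment|string|preamble)\w+\{\s*\K[^,\s]+' "$BIBFILE" | awk '!seen[$0]++')
        # Split multi-key citations and trim whitespace on both sides of each key
        CITATIONS=$(echo "$TEXCONTENT" | grep -oP '\\cite[tp]?\{\K[^}]+' | tr ',' '\n' | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' | awk '!seen[$0]++')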

        BIB_COUNT=$(echo "$BIB_ENTRIES" | grep -v '^$' | wc -l)
        CITE_TOTAL=$(echo "$TEXCONTENT" | grep -oP '\\cite[tp]?\{[^}]+\}' | wc -l)
        CITE_UNIQ=$(echo "$CITATIONS" | grep -v '^$' | wc -l)

        echo "Total bib entries: $BIB_COUNT"
        echo "Total citation commands: $CITE_TOTAL"
        echo "Unique cited keys: $CITE_UNIQ"

        echo "Uncited bibliography entries:"
        while IFS= read -r entry; do
            # Skip the empty line produced when the .bib file has no entries
            [ -z "$entry" ] && continue
            if ! echo "$CITATIONS" | grep -qxF -- "$entry"; then
                echo "  $entry"
            fi
        done <<< "$BIB_ENTRIES"

        echo "Missing bibliography entries (cited but not in .bib):"
        while IFS= read -r cite; do
            # Skip the empty line produced when the document cites nothing
            [ -z "$cite" ] && continue
            if ! echo "$BIB_ENTRIES" | grep -qxF -- "$cite"; then
                echo "  $cite"
            fi
        done <<< "$CITATIONS"
    else
        echo "Bibliography file not found: $BIBFILE"
    fi
else
    echo "No bibliography file found in document"
fi

echo -e "\n========================================"
echo "Check complete!"

Step 2: Make it Executable

chmod +x check_latex_refs.sh

Step 3: Run the Checker

./check_latex_refs.sh manuscript.tex

Replace manuscript.tex with your actual LaTeX filename.


Example Output

Here’s what the script reports for a typical manuscript:

Checking references in: paper.tex
========================================

=== TABLES ===
Total labels: 11
Total refs: 8
Unreferenced tables:
  tab:appendix_data
  tab:extra_results
  tab:summary_stats

=== FIGURES ===
Total labels: 6
Total refs: 6
Unreferenced figures:

=== EQUATIONS ===
Total labels: 15
Total refs: 12
Unreferenced equations:
  eq:supplementary
  eq:variance
  eq:alt_form

=== CITATIONS ===
Using bibliography file: references.bib
Total bib entries: 45
Total citation commands: 52
Unique cited keys: 38
Uncited bibliography entries:
  smith2020old
  jones2019unused
  brown2018extra
Missing bibliography entries (cited but not in .bib):
  nguyen2023missing

========================================
Check complete!

Note: Reference Commands Detected

The script detects references made with \ref{fig:one}, \autoref{tab:results}, or \eqref{eq:main}. \ref and \autoref are recognized for tables, figures, and equations; \eqref is recognized for eq: labels only.
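
As an illustration of which commands are picked up, the sketch below extracts referenced keys for one prefix. It uses grep -E and sed instead of the script's grep -P patterns, so it is an equivalent under that assumption rather than the script's own code. The function name list_refs is hypothetical:

```bash
# list_refs PREFIX: hypothetical helper. Reads LaTeX on stdin and prints
# each key referenced through \ref, \autoref, or \eqref, once, in order.
list_refs() {
    prefix=$1
    # Four backslashes inside double quotes reach grep as \\, which is a
    # literal backslash in an extended regular expression.
    grep -oE "\\\\(auto|eq)?ref\{${prefix}:[^}]+\}" |
        sed -e 's/^[^{]*{//' -e 's/}$//' |
        awk '!seen[$0]++'
}

printf '%s\n' \
    'See \ref{fig:one} and \autoref{fig:two}.' \
    'Again \ref{fig:one}, and page \pageref{fig:three}.' | list_refs fig
```

With this input, fig:one and fig:two are listed. \pageref{fig:three} is not, because the pattern requires the backslash to be followed directly by ref, autoref, or eqref.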


Advanced Usage

Check Multiple Files

Create check_all.sh:

#!/bin/bash

for file in *.tex; do
    echo "File: $file"
    ./check_latex_refs.sh "$file"
    echo ""
done
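
One detail of the loop above: when the directory contains no .tex files, bash passes the unmatched pattern *.tex through literally, and the checker is run on a file that does not exist. A minimal sketch of a guarded version follows; the function name check_all and its two parameters are hypothetical:

```bash
# check_all DIR CHECKER: hypothetical wrapper. Runs CHECKER on every .tex
# file in DIR and does nothing when the directory has none.
check_all() {
    dir=$1
    checker=$2
    for file in "$dir"/*.tex; do
        # An unmatched glob stays as the literal string "DIR/*.tex";
        # the existence test filters that case out.
        [ -e "$file" ] || continue
        printf 'File: %s\n' "$file"
        "$checker" "$file"
        printf '\n'
    done
}

# Example call, assuming the checker sits in the current directory:
#   check_all . ./check_latex_refs.sh
```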

Generate Timestamped Reports

# Create dated log
./check_latex_refs.sh paper.tex > "audit_$(date +%Y%m%d_%H%M%S).log"

# Compare before/after revision
./check_latex_refs.sh paper.tex > before_revision.log
# ... make changes ...
./check_latex_refs.sh paper.tex > after_revision.log
diff before_revision.log after_revision.log
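
A plain diff also reports changed counts and headings. If only the flagged items matter, a sketch like the following lists the items that were flagged before a revision and are gone afterwards. It relies on the fact that the script indents every flagged item by two spaces; the function name resolved_items is hypothetical:

```bash
# resolved_items BEFORE_LOG AFTER_LOG: hypothetical helper. Prints items
# that appear in BEFORE_LOG but no longer appear in AFTER_LOG.
resolved_items() {
    before=$1
    after=$2
    tmp_before=$(mktemp)
    tmp_after=$(mktemp)
    # Flagged items are the lines indented by two spaces; comm needs
    # both inputs sorted.
    grep '^  ' "$before" | sort -u > "$tmp_before"
    grep '^  ' "$after" | sort -u > "$tmp_after"
    # -23 suppresses lines unique to the second file and lines in both,
    # leaving only what disappeared.
    comm -23 "$tmp_before" "$tmp_after"
    rm -f "$tmp_before" "$tmp_after"
}

# Example call:
#   resolved_items before_revision.log after_revision.log
```

Items from all sections are merged into one list here, so a key that moves from one section to another is not reported.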

Git Pre-commit Hook

Add to .git/hooks/pre-commit and make the file executable (chmod +x .git/hooks/pre-commit):

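#!/bin/bash

# Run check (ref_check.log is written to the working directory)
./check_latex_refs.sh main.tex > ref_check.log

# Count flagged items: every reported label or key is indented by two spaces
UNREFERENCED=$(grep -c "^  " ref_check.log)

if [ "$UNREFERENCED" -gt 0 ]; then
    echo "Warning: $UNREFERENCED flagged items found"
    cat ref_check.log

    # Hooks do not get the terminal on stdin, so read the answer from /dev/tty
    read -p "Continue commit? (y/n) " -n 1 -r < /dev/tty
    echo
    if [[ ! $REPLY =~ ^[Yy]$ ]]; then
        exit 1
    fi
fi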

Makefile Integration

# Makefile
# Recipe lines must begin with a tab character, not spaces.

.PHONY: all check clean audit

all: manuscript.pdf check

manuscript.pdf: manuscript.tex
	pdflatex manuscript.tex
	bibtex manuscript
	pdflatex manuscript.tex
	pdflatex manuscript.tex

check:
	@./check_latex_refs.sh manuscript.tex

clean:
	rm -f *.aux *.log *.bbl *.blg *.out *.toc

audit:
	@./check_latex_refs.sh manuscript.tex > audit_$(shell date +%Y%m%d).log
	@cat audit_$(shell date +%Y%m%d).log

Usage:

make              # Compile and check
make check        # Run reference check only
make audit        # Generate timestamped audit

Extensions and Customizations

Add Line Numbers

Modify the unreferenced item display to show line numbers:

while IFS= read -r label; do
    if ! echo "$TAB_REFS" | grep -qx "$label"; then
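        # -F matches \label{...} literally; head keeps the first hit if the label occurs more than once
        LINE=$(grep -nF "\\label{tab:$label}" "$TEXFILE" | head -n 1 | cut -d: -f1)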
        echo "  tab:$label (line $LINE)"
    fi
done <<< "$TAB_LABELS_UNIQ"

Output:

Unreferenced tables:
  tab:appendix_data (line 145)
  tab:extra_results (line 203)
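
The lookup above searches the raw file, so a commented-out copy of the same label earlier in the file can be reported instead of the live one. A minimal sketch that avoids this blanks out comments with sed first. Because sed empties comment lines rather than deleting them, line numbers still match the original file. The function name label_line is hypothetical:

```bash
# label_line KEY FILE: hypothetical helper. Prints the line number of the
# first uncommented \label{KEY} in FILE.
label_line() {
    key=$1
    file=$2
    # Comments are cut but every line is kept, so numbering is unchanged.
    sed -E 's/(^|[^\\])%.*$/\1/' "$file" |
        grep -nF "\\label{$key}" | head -n 1 | cut -d: -f1
}

# Example call:
#   label_line tab:appendix_data manuscript.tex
```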

Add Section Labels

Add this after the equations section:

# Sections (preserving order, excluding comments)
echo -e "\n=== SECTIONS ==="
SEC_LABELS=$(echo "$TEXCONTENT" | grep -oP '\\label\{sec:\K[^}]+')
SEC_LABELS_UNIQ=$(echo "$SEC_LABELS" | awk '!seen[$0]++')
# Catches both \ref and \autoref, matching the table and figure checks
SEC_REFS=$(echo "$TEXCONTENT" | grep -oP '\\(auto)?ref\{sec:\K[^}]+' | awk '!seen[$0]++')
SEC_COUNT=$(echo "$SEC_LABELS_UNIQ" | grep -v '^$' | wc -l)
SECREF_COUNT=$(echo "$SEC_REFS" | grep -v '^$' | wc -l)

echo "Total section labels: $SEC_COUNT"
echo "Total section refs: $SECREF_COUNT"
echo "Unreferenced sections:"

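while IFS= read -r label; do
    # Skip the empty line that the here-string yields when no labels exist
    [ -z "$label" ] && continue
    # -F compares the label as a literal string rather than as a regex
    if ! echo "$SEC_REFS" | grep -qxF -- "$label"; then
        echo "  sec:$label"
    fi
done <<< "$SEC_LABELS_UNIQ"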
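
Since the table, figure, equation, and section blocks differ only in their prefix, the same check can be written once as a function. The sketch below is one possible refactoring, not the script's own code: it uses grep -E instead of grep -P, and it replaces the while loop with a set difference via grep -vxF -f. The function name unreferenced_labels is hypothetical:

```bash
# unreferenced_labels PREFIX FILE: hypothetical helper. Prints, in
# document order, every PREFIX: label in FILE that is never referenced.
unreferenced_labels() {
    prefix=$1
    file=$2
    clean=$(mktemp)
    refs=$(mktemp)

    # Strip comments once so that both passes see the same text.
    sed -E 's/(^|[^\\])%.*$/\1/' "$file" > "$clean"

    # Keys referenced through \ref, \autoref, or \eqref.
    grep -oE "\\\\(auto|eq)?ref\{${prefix}:[^}]+\}" "$clean" |
        sed -e 's/^[^{]*{//' -e 's/}$//' > "$refs"

    # Labels in document order, minus every key listed in $refs.
    # -x matches whole lines, -F treats keys as literal strings.
    grep -oE "\\\\label\{${prefix}:[^}]+\}" "$clean" |
        sed -e 's/^[^{]*{//' -e 's/}$//' |
        awk '!seen[$0]++' |
        grep -vxF -f "$refs"

    rm -f "$clean" "$refs"
}

# Example call for several prefixes:
#   for p in tab fig eq sec; do unreferenced_labels "$p" manuscript.tex; done
```

Adding a new label type then only means adding its prefix to the loop.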

Color Output

Add ANSI color codes for better visibility:

# Add at top of script (after shebang)
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

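# Use in output
echo -e "${BLUE}=== TABLES ===${NC}"
echo "Total labels: $TAB_COUNT"
echo "Total refs: $REF_COUNT"

# Collect the unreferenced labels first. Comparing the two counts is not
# reliable, because a \ref to an undefined label also raises the ref count.
UNREF_TABS=$(while IFS= read -r label; do
    [ -z "$label" ] && continue
    echo "$TAB_REFS" | grep -qxF -- "$label" || echo "$label"
done <<< "$TAB_LABELS_UNIQ")

if [ -z "$UNREF_TABS" ]; then
    echo -e "${GREEN}All tables referenced!${NC}"
else
    echo -e "${RED}Unreferenced tables:${NC}"
    while IFS= read -r label; do
        echo -e "  ${YELLOW}tab:$label${NC}"
    done <<< "$UNREF_TABS"
fi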
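
Color codes become noise once the output is redirected to a log file, as in the timestamped reports. A minimal sketch that enables colors only when standard output is a terminal is shown below. The function name setup_colors is hypothetical, and honoring the NO_COLOR environment variable is an informal convention adopted here as an assumption, not something the script does:

```bash
# setup_colors: hypothetical helper. Sets the color variables to ANSI
# escape sequences on a terminal and to empty strings otherwise.
setup_colors() {
    if [ -t 1 ] && [ -z "${NO_COLOR:-}" ]; then
        RED=$(printf '\033[0;31m')
        GREEN=$(printf '\033[0;32m')
        YELLOW=$(printf '\033[1;33m')
        BLUE=$(printf '\033[0;34m')
        NC=$(printf '\033[0m')
    else
        RED='' GREEN='' YELLOW='' BLUE='' NC=''
    fi
}

setup_colors
# The variables already hold the escape bytes, so echo -e is not needed.
printf '%s=== TABLES ===%s\n' "$BLUE" "$NC"
```

Run interactively, the heading is blue; piped into a file or another command, it is plain text.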

VS Code Task Integration

Add to .vscode/tasks.json:

{
  "version": "2.0.0",
  "tasks": [
    {
      "label": "Check LaTeX References",
      "type": "shell",
      "command": "./check_latex_refs.sh",
      "args": ["${file}"],
      "problemMatcher": [],
      "presentation": {
        "reveal": "always",
        "panel": "new"
      }
    }
  ]
}

Comparison with Alternatives

Tool | Pros | Cons | Best For
This script | Fast, customizable, comment-aware | Single file only | Quick audits, CI/CD
refcheck package | LaTeX-integrated, visual markers | Must recompile | During writing
chktex | Comprehensive linting | Verbose output | Deep analysis
VS Code LaTeX Workshop | Real-time, editor-integrated | Editor-specific | Active editing

Limitations and Workarounds

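Warning: Current Limitations
  1. Single file only: Doesn’t traverse \input{} or \include{} commands
  2. Standard prefixes: Assumes the tab:, fig:, and eq: conventions (plus sec: with the extension above)
  3. Reference commands: Only detects \ref, \autoref, and \eqref; other commands such as \cref or \pageref are not counted
  4. Citation commands: Only detects \cite, \citep, and \citet written without optional arguments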
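
For the citation limitation, the sketch below shows one way to widen the pattern so that optional arguments and other commands starting with \cite (for example \citeauthor) are accepted. It is a possible extension under those assumptions, not part of the script, and the function name cited_keys is hypothetical:

```bash
# cited_keys: hypothetical helper. Reads LaTeX on stdin and prints each
# cited key once, in order of first appearance.
cited_keys() {
    # \cite plus optional letters and star, any number of [..] optional
    # arguments, then the {key,key,...} group.
    grep -oE '\\cite[a-zA-Z]*\*?(\[[^]]*\])*\{[^}]+\}' |
        sed -e 's/^.*{//' -e 's/}$//' |
        tr ',' '\n' |
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' |
        awk 'NF && !seen[$0]++'
}

printf '%s\n' \
    'As shown \citep[see][p.~5]{smith2023, lee2021} and \citet{lee2021}.' |
    cited_keys
```

Here smith2023 and lee2021 are each listed once. Commands that do not begin with \cite, such as biblatex's \parencite and \textcite, are still not matched by this pattern.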

Workarounds

For multi-file projects:

# Combine files first
cat main.tex chapter*.tex appendix.tex > combined.tex
./check_latex_refs.sh combined.tex
rm combined.tex
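
The cat approach requires listing the files by hand. As an alternative, the sketch below follows \input{} and \include{} commands starting from the main file and prints the expanded text. It is a rough illustration with stated assumptions: at most one such command per line, the rest of that line is dropped, paths are resolved relative to the main file's directory, and there is no protection against a file that includes itself. The function name expand_inputs is hypothetical:

```bash
# expand_inputs FILE [ROOT]: hypothetical helper. Prints FILE with each
# \input{name} or \include{name} line replaced by that file's contents.
expand_inputs() {
    local file root line name
    file=$1
    root=${2:-$(dirname "$1")}
    while IFS= read -r line || [ -n "$line" ]; do
        # [^%]* refuses to look past a percent sign, which is a crude
        # way of ignoring commands that sit inside a comment.
        name=$(printf '%s\n' "$line" |
            sed -n -E 's/^[^%]*\\(input|include)\{([^}]+)\}.*$/\2/p')
        if [ -n "$name" ]; then
            case $name in
                *.tex) ;;
                *) name=$name.tex ;;
            esac
            if [ -f "$root/$name" ]; then
                expand_inputs "$root/$name" "$root"
                continue
            fi
        fi
        printf '%s\n' "$line"
    done < "$file"
}

# Example call:
#   expand_inputs main.tex > combined.tex
```

\includegraphics{} lines are left alone, because the pattern requires the opening brace directly after input or include.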

For custom prefixes:

# Modify the grep patterns in the script
# Change tab: to tbl:
grep -oP '\\label\{tbl:\K[^}]+'
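
Before editing the patterns, it can help to see which prefixes a document actually uses. A minimal sketch that counts them is shown below; the function name label_prefixes is hypothetical, and comments are not stripped here, so commented-out labels are counted too:

```bash
# label_prefixes FILE: hypothetical helper. Prints each label prefix
# (the text before the first colon) with its count, most frequent first.
label_prefixes() {
    grep -oE '\\label\{[^:}]+:' "$1" |
        sed -e 's/^.*{//' -e 's/:$//' |
        sort | uniq -c | sort -rn
}

# Example call:
#   label_prefixes manuscript.tex
```

Labels without a colon, such as \label{intro}, are skipped by the pattern and do not appear in the counts.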

For biblatex:

# Change BIBFILE detection:
BIBFILE=$(echo "$TEXCONTENT" | grep -oP '\\addbibresource\{\K[^}]+' | head -1)
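
Both variants above keep only the first bibliography command and treat its argument as a single file name. The sketch below accepts either command, splits comma-separated lists such as \bibliography{refs,extra}, and appends .bib where it is missing. The function name find_bib_files is hypothetical, and the rest of the script would still need a loop over the resulting names:

```bash
# find_bib_files: hypothetical helper. Reads LaTeX on stdin and prints
# one .bib file name per line.
find_bib_files() {
    # The opening brace must follow the command name directly, so
    # \bibliographystyle{...} is not matched.
    grep -oE '\\(bibliography|addbibresource)\{[^}]+\}' |
        sed -e 's/^.*{//' -e 's/}$//' |
        tr ',' '\n' |
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' |
        awk 'NF { if ($0 !~ /\.bib$/) $0 = $0 ".bib"; print }'
}

printf '%s\n' \
    '\bibliographystyle{plain}' \
    '\bibliography{refs, extra}' | find_bib_files
```

For this input the function prints refs.bib and extra.bib.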

Troubleshooting

Issue: Script shows no output

# Check file exists and has content
ls -la manuscript.tex
wc -l manuscript.tex

# Check for labels
grep -F '\label{' manuscript.tex | head -5

# Run with debug mode
bash -x check_latex_refs.sh manuscript.tex

Issue: Bibliography not found

# Check bibliography command exists
grep "bibliography" manuscript.tex

# Verify .bib file location
ls -la *.bib

Issue: Commented lines still appear

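# Try alternative comment removal in one pass (\1 restores the character before %)
TEXCONTENT=$(sed -E 's/(^|[^\\])%.*$/\1/' "$TEXFILE")

# Or use Perl (the lookbehind keeps \% and also covers lines that start with %)
TEXCONTENT=$(perl -pe 's/(?<!\\)%.*$//' "$TEXFILE")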

Quick Reference

Task | Command
Run checker | ./check_latex_refs.sh manuscript.tex
Make executable | chmod +x check_latex_refs.sh
Save to log | ./check_latex_refs.sh paper.tex > audit.log
Check multiple | for f in *.tex; do ./check_latex_refs.sh "$f"; done
Debug mode | bash -x check_latex_refs.sh manuscript.tex

Tip: Key Takeaways
  • Run the checker before every submission
  • Use timestamped logs to track cleanup progress
  • Integrate with Git hooks for automatic checking
  • Customize prefixes for your project conventions