Description of website
- a static website (generated by a static site generator SSG like Hugo)
- hosted on S3/CloudFront
Important relevant features:
- very fast loading speed and performance
- strictly using HTTPS
- redirects HTTP to HTTPS
- advanced security headers
- minimal/zero JS and dynamic content
This article presents a streamlined set of Semrush Site Audit features, chosen to fit the design of the website described above and leaving out trivial, futile, or nonsensical checks (like text-to-HTML ratio). It is condensed from [the complete set] ().
Streamlined Semrush Site Audit Features (for a Hugo-S3-CloudFront Static Site)
I. Core Functionality: Website Crawling & Analysis
- Automated Website Crawler: The engine that navigates your website to discover and analyze pages. (Essential for any audit)
- Core Issue Detection: Identifies a curated list of technical SEO and on-page issues relevant to static sites.
II. Diagnostic Output & Prioritization
- Overall Site Health Score: A percentage-based metric (0-100%) giving an overview based on the relevant checks.
- Issue Categorization (Errors, Warnings, Notices): Helps prioritize fixes for the issues that do matter for your site type.
- “Why and How to Fix” Guidance: Provides explanations and resolution advice for the identified relevant issues.
III. Key Areas of Audit & Relevant Checks
This is where we significantly tailor the focus.
- A. Crawlability & Indexability (Critical for Search Engine Visibility):
- robots.txt Analysis: Checks for syntax errors or directives accidentally blocking important content or crawlers. (Still vital)
- sitemap.xml Validation: Verifies format, presence, and consistency with crawled pages; identifies pages in sitemap returning errors. (Still vital)
- Crawl Depth: Identifies pages too many clicks away from the homepage, impacting discoverability and link equity flow. (Important for site architecture)
- Broken Links (Internal & External):
- Internal Broken Links (4xx): Critical for user experience and crawlability.
- External Broken Links (4xx): Impacts user trust and potentially SEO.
- Server Errors on Internal Pages (5xx): Could indicate issues with S3/CloudFront configuration for specific paths or underlying file issues.
- Redirect Issues:
- Redirect Chains & Loops: Can waste crawl budget and slow down user experience. (Even with CloudFront, misconfigurations in rules or static meta redirects can cause this)
- Orphan Pages: Locates pages with no internal links pointing to them, making them hard for users and search engines to find. (Important for content discovery)
- Blocking Directives: Identifies pages blocked by noindex meta tags or X-Robots-Tag (if you use it via Lambda@Edge/CloudFront Functions). Ensures you’re not unintentionally de-indexing content.
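The crawl depth and orphan page checks above come down to a graph traversal over internal links. The following is a minimal sketch of that idea, not Semrush's implementation: the `links` graph, the `sitemap` list, and the depth threshold of 2 are all hypothetical values chosen for illustration.

```python
from collections import deque


def crawl_depths(links, start):
    """Breadth-first search from the homepage; returns {url: clicks from start}."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def orphan_pages(links, known_pages):
    """Pages from a known set (e.g. the sitemap) that no other page links to."""
    linked = {t for source, targets in links.items() for t in targets if t != source}
    return sorted(p for p in known_pages if p not in linked)


# Hypothetical link graph: page -> pages it links to.
links = {
    "/": ["/blog/", "/about/"],
    "/blog/": ["/blog/post-1/", "/"],
    "/blog/post-1/": ["/blog/post-2/"],
    "/blog/post-2/": [],
    "/about/": [],
    "/old-landing/": ["/"],
}
sitemap = ["/", "/blog/", "/blog/post-1/", "/blog/post-2/", "/about/", "/old-landing/"]

depths = crawl_depths(links, "/")
# The threshold of 2 clicks is an illustrative assumption.
too_deep = sorted(url for url, depth in depths.items() if depth > 2)
print("Too deep:", too_deep)
print("Orphans:", orphan_pages(links, sitemap))
```

For a Hugo site, the link graph could be built from the generated `public/` directory rather than from a live crawl, which keeps the check fast and independent of the CDN.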
- B. Fundamental On-Page SEO Elements (Content & Structure):
- Meta Tags:
- Title Tags: Checks for Missing, truly Duplicate (identical across distinct content pages), or significantly Too Long/Short impacting SERP display.
- Meta Descriptions: Checks for Missing, truly Duplicate, or significantly Too Long/Short impacting SERP display and click-through rates.
- Heading Tags (H1):
- Missing H1 Tags: Ensures key pages have a primary heading for structure and relevance.
- Content Uniqueness:
- Duplicate Content: Identifies pages with substantially similar content bodies (beyond just H1/title similarity), which can be a genuine issue.
- Image Accessibility & Integrity:
- Missing Alt Attributes: Crucial for accessibility and image SEO.
- Broken Images: Detects images that fail to load.
- Canonicalization:
- rel="canonical" Issues: Checks for incorrect implementation, non-indexable canonicals, or multiple canonical tags. Important if you have URL variations (e.g., parameters you don’t want indexed, or case variations if your web server/CDN doesn’t normalize them).
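The per-page checks in this group (missing title, missing meta description, missing H1, images without alt attributes, multiple canonical tags) can be approximated locally against Hugo's generated HTML. The sketch below uses only Python's standard `html.parser`; the class name, function name, and issue labels are hypothetical, and it does not attempt the cross-page duplicate checks.

```python
from html.parser import HTMLParser


class OnPageParser(HTMLParser):
    """Collects the on-page elements listed above from one HTML document."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self.h1_count = 0
        self.images_missing_alt = []
        self.canonicals = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (a.get("name") or "").lower() == "description":
            self.description = a.get("content") or ""
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "img" and "alt" not in a:
            # An empty alt="" is valid for decorative images; only a missing attribute is flagged.
            self.images_missing_alt.append(a.get("src"))
        elif tag == "link" and "canonical" in (a.get("rel") or "").lower().split():
            self.canonicals.append(a.get("href"))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit_page(html):
    """Returns a list of issue labels for one page (labels are illustrative)."""
    parser = OnPageParser()
    parser.feed(html)
    issues = []
    if not parser.title.strip():
        issues.append("missing title")
    if not parser.description:
        issues.append("missing meta description")
    if parser.h1_count == 0:
        issues.append("missing h1")
    if parser.images_missing_alt:
        issues.append("image without alt attribute")
    if len(parser.canonicals) > 1:
        issues.append("multiple canonical tags")
    return issues


sample = """<html><head><title>Pricing</title>
<link rel="canonical" href="https://example.com/pricing/">
<link rel="canonical" href="https://example.com/pricing">
</head><body><img src="/chart.png"><p>No heading here.</p></body></html>"""
print(audit_page(sample))
```

Duplicate titles and descriptions could be found by running the same parser over every page and grouping URLs by the collected value.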
- C. Basic HTTPS Integrity:
- Mixed Content: Checks for any accidental HTTP resources (images, scripts, CSS) linked from your HTTPS pages. (Even with a strong HTTPS setup, manual errors can introduce this).
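A mixed content check amounts to scanning each page for subresources requested over plain HTTP. The sketch below is one possible local approximation, assuming the pages are available as HTML strings; the tag and `rel` lists are illustrative and not exhaustive (for example, URLs inside CSS files are not inspected).

```python
from html.parser import HTMLParser

# Tags whose src attribute triggers a subresource load. Plain <a href> links do not.
RESOURCE_ATTRS = {
    "img": "src", "script": "src", "iframe": "src", "audio": "src",
    "video": "src", "source": "src", "embed": "src",
}
# <link> only loads a resource for certain rel values (canonical, for one, does not).
LINK_RELS = {"stylesheet", "icon", "preload", "modulepreload", "manifest"}


class MixedContentParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link":
            rel = set((a.get("rel") or "").lower().split())
            url = a.get("href") if rel & LINK_RELS else None
        else:
            url = a.get(RESOURCE_ATTRS.get(tag, ""))
        if url and url.strip().lower().startswith("http://"):
            self.insecure.append((tag, url))


def find_mixed_content(html):
    """Returns (tag, url) pairs for resources referenced over plain HTTP."""
    parser = MixedContentParser()
    parser.feed(html)
    return parser.insecure


page = """<link rel="stylesheet" href="https://example.com/site.css">
<img src="http://example.com/logo.png" alt="Logo">
<a href="http://example.org/">An outbound link, not mixed content</a>
<script src="HTTP://cdn.example.net/lib.js"></script>"""
for tag, url in find_mixed_content(page):
    print(tag, url)
```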
- D. Internal Linking Structure (Site Architecture & Equity Flow):
- Internal Link Distribution: Highlights pages with critically low numbers of incoming internal links (potential orphans or poorly integrated content).
- Nofollow Attributes on Internal Links: Flags if you’re unintentionally using nofollow on internal links, blocking link equity flow.
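Both internal linking checks can be sketched from the same pass over each page's anchors: count followed internal inlinks per page, and record internal links that carry `rel="nofollow"`. The example below is a rough illustration under stated assumptions; `pages`, `site_host`, and the function names are hypothetical, and URLs are not normalized beyond dropping fragments.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects (href, rel tokens) for every <a href> in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        a = dict(attrs)
        if a.get("href"):
            self.links.append((a["href"], (a.get("rel") or "").lower().split()))


def audit_internal_links(pages, site_host):
    """pages maps absolute page URL -> HTML. Returns (inlink counts, nofollowed internal links)."""
    inlinks = Counter({url: 0 for url in pages})
    nofollowed = []
    for page_url, html in pages.items():
        parser = LinkParser()
        parser.feed(html)
        for href, rel in parser.links:
            target = urljoin(page_url, href).split("#")[0]
            if urlparse(target).netloc != site_host:
                continue  # external link, out of scope here
            if "nofollow" in rel:
                nofollowed.append((page_url, target))
            elif target != page_url:  # ignore self-links and same-page anchors
                inlinks[target] += 1
    return inlinks, nofollowed


# Hypothetical three-page site.
pages = {
    "https://example.com/": '<a href="/docs/">Docs</a> <a href="/pricing/" rel="nofollow">Pricing</a> '
                            '<a href="https://other.example.org/">Partner</a>',
    "https://example.com/docs/": '<a href="/">Home</a> <a href="#top">Top</a>',
    "https://example.com/pricing/": '<a href="../">Home</a>',
}
inlinks, nofollowed = audit_internal_links(pages, "example.com")
weakly_linked = [url for url, count in inlinks.items() if count == 0]
print("No followed inlinks:", weakly_linked)
print("Internal nofollow:", nofollowed)
```

Here the pricing page ends up with no followed inlinks precisely because its only internal link is nofollowed, which is the situation the two checks are meant to surface together.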
- E. Structured Data & Markup (If Used):
- Schema.org Markup Validation: If you implement structured data for rich snippets, this checks for presence and validity.
- F. International SEO (Hreflang) (If Applicable):
- hreflang Tag Validation: If your site targets multiple languages/regions, this is critical for correct implementation.
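Two common hreflang mistakes are a missing self-reference and a missing return link from the alternate page. The sketch below checks only those two conditions; the `annotations` structure and the issue wording are assumptions made for illustration, and language-code validity is not checked.

```python
def hreflang_issues(annotations):
    """annotations maps page URL -> {hreflang code: alternate URL} as declared on that page."""
    issues = []
    for page, alternates in annotations.items():
        if page not in alternates.values():
            issues.append((page, "missing self-reference"))
        for lang, target in alternates.items():
            if target == page:
                continue
            # The alternate page should list this page among its own alternates.
            if page not in annotations.get(target, {}).values():
                issues.append((page, f"no return link from {target} ({lang})"))
    return issues


# Hypothetical two-language site where the German page forgets to link back.
annotations = {
    "https://example.com/en/": {
        "en": "https://example.com/en/",
        "de": "https://example.com/de/",
    },
    "https://example.com/de/": {
        "de": "https://example.com/de/",
    },
}
for page, problem in hreflang_issues(annotations):
    print(page, "->", problem)
```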
IV. Audit Management & Customization
- A. Crawl Configuration & Scope:
- Crawl Source: Choose to crawl from website, sitemap, or a list of URLs.
- Crawl Scope: Define audit scope (entire site, subdomains, specific subfolders).
- User-Agent Selection: Can be useful for specific diagnostics.
- Robots.txt & URL Parameter Handling: Customize how the crawler interacts.
- B. Monitoring & Progress Tracking:
- Scheduled Audits: Automate audits to monitor for new relevant issues.
- Progress Tab: Track changes in Site Health Score and the count of relevant issues over time.
- Crawl Comparison: Compare results between different audit runs to see what’s been fixed or what new relevant issues have appeared.
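Conceptually, comparing two audit runs is a set difference over (URL, issue) pairs. The sketch below shows that idea on hand-written data; the pair format and issue names are hypothetical and do not reflect the layout of Semrush's exports.

```python
def compare_audits(previous, current):
    """Each argument is an iterable of (url, issue) pairs from one audit run."""
    prev, curr = set(previous), set(current)
    return {
        "fixed": sorted(prev - curr),
        "new": sorted(curr - prev),
        "unchanged": sorted(prev & curr),
    }


# Hypothetical results from two consecutive runs.
last_week = [
    ("/blog/post-1/", "broken internal link"),
    ("/about/", "missing meta description"),
]
this_week = [
    ("/about/", "missing meta description"),
    ("/pricing/", "missing title"),
]
diff = compare_audits(last_week, this_week)
for status, items in diff.items():
    print(status, items)
```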
V. Reporting & Integrations
- A. Data Export & Reporting:
- PDF Reports: Generate reports focusing on the selected relevant checks.
- Data Export (CSV/Excel): Export lists of relevant issues for action.
- B. Integrations (Limited Utility in this Streamlined View):
- Google Analytics: Can still be useful to overlay pageview data on pages with identified relevant issues to help prioritize fixes (e.g., a broken internal link on a high-traffic page).
Key Differences & Rationale for Streamlining:
- Removed Performance/CWV/Extensive HTTPS Checks: Your setup (Hugo, S3/CloudFront, pre-configured HTTPS/HSTS) makes these largely redundant for routine auditing. A basic mixed content check remains useful as a safeguard.
- Removed “Pedantic” On-Page Checks: Items like “short H1,” “duplicate H1/title,” “text-to-HTML ratio,” “low word count,” and “multiple H1s” are excluded as per your preference. The focus is on more fundamental on-page elements like missing or truly duplicate core tags.
- Emphasis on Core SEO Pillars: The streamlined list prioritizes:
- Discoverability: Can search engines find your content (robots.txt, sitemap, crawl depth)?
- Indexability: Can they process and index it correctly (noindex, canonicals)?
- Content Integrity: Is core on-page information present and unique where it matters (titles, descriptions, H1s, alt text)?
- Site Structure & Links: Is the site well-linked internally, and are there broken pathways?
- User Experience (Basic): Broken links, broken images.
This refined list should give you a powerful subset of Semrush’s Site Audit capabilities that directly address the most impactful SEO factors for your well-optimized static website, without the noise of checks that are less relevant or that you find unnecessary. Remember, you can often customize the checks within Semrush’s audit setup to further tailor it to your needs.