Code Health
Octokraft continuously measures the health of your codebase and distills it into a single score. That score is not a vague sentiment; it is a precise, auditable number built from every issue detected across your repositories. This page explains exactly how that number is calculated, what the categories mean, and how to interpret the results on your dashboard.
How Scoring Works
Size Normalization
Scores are normalized by codebase size. Octokraft measures issues per thousand lines of code (KLOC), so a 100,000-line repository with 50 issues scores the same as a 10,000-line repository with 5 issues. This prevents large codebases from being unfairly penalized and makes scores comparable across projects of different sizes.
The Scoring Curve
Health scores use a sigmoid curve:
- Resilience at the top. A single bad issue does not tank your score to zero. Scores decline gradually, then plateau. A codebase with one critical vulnerability is not as bad as one with fifty, and the score reflects that difference.
- Diminishing returns. Going from a score of 90 to 95 requires more effort than going from 50 to 55. The last few points reflect genuine excellence, not just the absence of obvious problems.
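The normalization and curve described above can be sketched in a few lines. This is a minimal illustration, not Octokraft's actual formula: the function names, the logistic form, and the `midpoint` and `steepness` values are all assumptions chosen only to show the shape of the idea.

```python
import math


def issues_per_kloc(issue_count: int, lines_of_code: int) -> float:
    """Issue density: issues per thousand lines of code (KLOC)."""
    return issue_count / (lines_of_code / 1000)


def sigmoid_score(density: float, midpoint: float = 5.0, steepness: float = 1.0) -> float:
    """Map an issue density onto a 0-100 score with a logistic (sigmoid) curve.

    ASSUMPTION: midpoint and steepness are made-up illustration values,
    not Octokraft's real parameters.
    """
    return 100.0 / (1.0 + math.exp(steepness * (density - midpoint)))


# The two repositories from the text have the same density.
large_repo = issues_per_kloc(50, 100_000)
small_repo = issues_per_kloc(5, 10_000)
print(large_repo == small_repo)  # True: both are 0.5 issues per KLOC

# As density rises, the score falls smoothly instead of dropping to zero at once.
for density in (0.5, 2.0, 5.0, 10.0):
    print(density, round(sigmoid_score(density), 1))
```

On a curve of this shape, the score changes fastest around the midpoint and flattens near the top, which is why the last few points require a larger reduction in issue density than the same gain in the middle of the range.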
Overall Score
Your overall health score is a weighted average of all 8 category scores. Categories that represent higher risk to your system carry more weight.
The 8 Categories
Every issue detected in your codebase is classified into one of 8 categories. Each category has a weight that reflects its impact on system reliability, and a baseline threshold that defines the acceptable issue density.
Security
Weight: 2.0x -- Highest
What it measures: Vulnerabilities that could compromise your system, expose user data, or allow unauthorized access.
Why it is weighted highest: A single security vulnerability can compromise your entire system. No amount of clean code or good architecture matters if an attacker can bypass authentication or inject malicious queries. Critical security issues have zero tolerance in scoring: even one causes significant score impact.
Issues in this category:
- SQL injection and other injection attacks
- Hardcoded credentials and secrets in source code
- Missing input validation on user-facing endpoints
- Insecure deserialization
- Exposed API keys or tokens
- Authentication and authorization bypasses
- Cross-site scripting (XSS) vectors
Runtime Risks
Weight: 1.5x -- High
What it measures: Code patterns that can crash your application or cause failures in production.
Why it is weighted high: Runtime issues are the difference between software that works and software that fails under real conditions. Unlike code smells, which slow development, runtime risks cause outages.
Issues in this category:
- Null pointer dereferences and unhandled nil access
- Unhandled exceptions and missing error propagation
- Resource leaks — unclosed database connections, file handles, network sockets
- Race conditions in concurrent code
- Type coercion errors that fail silently
- Infinite loops and unbounded recursion
- Memory leaks from retained references
Test Coverage
Weight: 1.5x -- High
What it measures: Whether your tests actually verify the behavior of your code, not just whether tests exist.
Why it is weighted high: Untested code breaks in production without warning. Tests are your safety net for every future change. A codebase without meaningful test coverage cannot be changed with confidence.
What Octokraft evaluates:
- Test presence per module — do modules that contain business logic also contain tests?
- Structural coverage — what percentage of exported functions are exercised by test callers?
- Assertion density — how many assertions per test function? A test with no assertions verifies nothing.
- Mock usage patterns — are tests isolated, or do they depend on external services?
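To make assertion density concrete, here is a rough sketch for Python test files using the standard `ast` module. It is an illustration only and not how Octokraft computes the metric: the `test_` naming convention and the choice to count only bare `assert` statements (ignoring helpers such as `self.assertEqual`) are simplifying assumptions.

```python
import ast


def assertion_density(source: str) -> dict[str, int]:
    """Count `assert` statements inside each test_* function of a module."""
    tree = ast.parse(source)
    counts = {}
    for node in ast.walk(tree):
        is_function = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_function and node.name.startswith("test_"):
            counts[node.name] = sum(
                isinstance(child, ast.Assert) for child in ast.walk(node)
            )
    return counts


# Hypothetical test module used only for this example.
SAMPLE = '''
def test_total():
    assert 1 + 1 == 2
    assert sum([1, 2]) == 3

def test_nothing():
    print("runs but verifies nothing")
'''

densities = assertion_density(SAMPLE)
# Tests with zero assertions run but verify nothing.
empty_tests = [name for name, count in densities.items() if count == 0]
print(densities, empty_tests)
```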
Code Smells
Weight: 1.0x -- Baseline
What it measures: Anti-patterns that slow development and increase the probability of bugs.
Why it is baseline weight: Code smells do not directly cause production failures, but they make your codebase harder to understand, harder to change, and more likely to accumulate bugs over time. They represent the chronic condition of technical debt.
Issues in this category:
- God classes — classes with too many methods or responsibilities
- Deeply nested conditionals that are difficult to reason about
- Overly complex functions (high cyclomatic complexity)
- Long parameter lists that indicate poor abstraction
- Feature envy — functions that use more data from another module than their own
- Primitive obsession — using raw strings and integers where domain types should exist
Duplication
Weight: 0.8x -- Moderate
What it measures: Repeated code that should be consolidated.
Why it is weighted lower: Duplicated code is technical debt, but it does not directly cause bugs. The risk is indirect: when a bug is fixed in one copy but not the other, or when behavior diverges between copies over time.
Issues in this category:
- Copy-paste code blocks across files
- Similar logic repeated with minor variations
- Redundant implementations of the same algorithm
- Duplicated validation rules across endpoints
Dead Code
Weight: 0.8x -- Moderate
What it measures: Code that exists but is never executed.
Why it is weighted lower: Dead code does not cause bugs; it causes confusion. Developers waste time reading, maintaining, and working around code that serves no purpose. It inflates the codebase and creates misleading search results.
How Octokraft detects it: Octokraft builds a call graph across your codebase. If an exported function has no callers and no references anywhere in the project, it is flagged as dead code. Internal or private functions are not flagged, since they may be intended for near-term use.
Issues in this category:
- Functions that are never called from anywhere in the codebase
- Variables assigned but never read
- Unreachable code paths after early returns or throws
- Unused imports and dependencies
- Feature-flagged code where the flag has been permanently disabled
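The call-graph check for exported functions can be pictured roughly as follows. This is a simplified sketch under stated assumptions, not Octokraft's implementation: the graph is a hand-written dictionary mapping each function to the functions it calls, and every function name is hypothetical.

```python
def find_dead_exports(exported: set[str], call_graph: dict[str, set[str]]) -> set[str]:
    """Return exported functions that nothing else in the graph calls."""
    referenced = set()
    for caller, callees in call_graph.items():
        # A function calling itself does not count as a real reference.
        referenced |= callees - {caller}
    return exported - referenced


# Hypothetical project: main() is an internal entry point, the rest are exported.
call_graph = {
    "main": {"checkout"},
    "checkout": {"format_price"},
    "format_price": set(),
    "formatCurrency": set(),  # superseded helper that no one calls any more
}
exported = {"checkout", "format_price", "formatCurrency"}

dead = find_dead_exports(exported, call_graph)
print(dead)  # {'formatCurrency'}
```

Only names in `exported` are candidates, which mirrors the rule that internal or private functions are not flagged.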
Example: formatCurrency() was replaced by a library call six months ago. The original function still exists, still shows up in search results, and new developers occasionally call it instead of the library version.
Consistency
Weight: 0.7x -- Lower
What it measures: Whether your codebase follows its own established patterns.
Why it is weighted lower: Inconsistency rarely causes bugs directly, but it significantly slows onboarding, increases cognitive load, and makes code review harder. When every file handles errors differently, developers cannot build reliable intuitions about how the codebase behaves.
Issues in this category:
- Mixed naming conventions (camelCase in some files, snake_case in others)
- Inconsistent error handling patterns across modules
- Different approaches to the same problem in different parts of the codebase
- Naming patterns that deviate from established project conventions
Compliance
Weight: 0.5x -- Lowest
What it measures: Adherence to framework best practices and coding standards.
Why it is weighted lowest: Standards violations are the least likely to cause production issues. They represent deviations from recommended practices rather than concrete risks. However, they still matter for long-term maintainability.
Issues in this category:
- Framework best practice violations
- Coding standard deviations
- Missing or incomplete documentation on public APIs
- Deprecated API usage
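With all eight weights listed, the overall score described under Overall Score can be sketched as a plain weighted average. The weights below come from this page; the dictionary keys, the function name, and the sample category scores are hypothetical, and any extra steps Octokraft applies beyond a weighted average are not shown.

```python
# Category weights as listed on this page.
CATEGORY_WEIGHTS = {
    "security": 2.0,
    "runtime_risks": 1.5,
    "test_coverage": 1.5,
    "code_smells": 1.0,
    "duplication": 0.8,
    "dead_code": 0.8,
    "consistency": 0.7,
    "compliance": 0.5,
}


def overall_score(category_scores: dict[str, float]) -> float:
    """Weighted average of the eight category scores (each 0-100)."""
    total_weight = sum(CATEGORY_WEIGHTS.values())
    weighted_sum = sum(
        weight * category_scores[name] for name, weight in CATEGORY_WEIGHTS.items()
    )
    return weighted_sum / total_weight


# Hypothetical scores: the same 50-point category score hurts more in Security
# than in Compliance because Security carries four times the weight.
all_ninety = {name: 90.0 for name in CATEGORY_WEIGHTS}
low_security = overall_score({**all_ninety, "security": 50.0})
low_compliance = overall_score({**all_ninety, "compliance": 50.0})
print(round(low_security, 1), round(low_compliance, 1))
```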
Severity Levels
Every issue has a severity level that determines how heavily it impacts your health score. Severity carries a penalty multiplier:
| Severity | Multiplier | Meaning |
|---|---|---|
| Critical | 5.0x | Must fix immediately. Data loss, security breach, or system crash. |
| High | 3.0x | Should fix before merge. Significant bugs or performance issues. |
| Medium | 1.5x | Fix when possible. Code quality issues that accumulate over time. |
| Low | 1.0x | Nice to fix. Minor improvements that incrementally improve the codebase. |
| Info | 0.3x | Awareness only. Suggestions and observations, not problems. |
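One way to read the multipliers is as weights on a simple issue count. The sketch below sums the multiplier for each issue; the multipliers are taken from the table, but summing them into a single penalty is an assumption made for illustration, and the function name is hypothetical.

```python
# Multipliers from the severity table.
SEVERITY_MULTIPLIERS = {
    "critical": 5.0,
    "high": 3.0,
    "medium": 1.5,
    "low": 1.0,
    "info": 0.3,
}


def weighted_penalty(severities: list[str]) -> float:
    """Sum of severity multipliers for a list of issues.

    ASSUMPTION: penalties are additive; the real aggregation may differ.
    """
    return sum(SEVERITY_MULTIPLIERS[severity] for severity in severities)


# Under this assumption, one critical issue weighs as much as five low ones.
one_critical = weighted_penalty(["critical"])
five_low = weighted_penalty(["low"] * 5)
print(one_critical == five_low)  # True
```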
Grades
Your health score maps to a letter grade:
| Grade | Score Range |
|---|---|
| A+ | 95–100 |
| A | 90–94 |
| A- | 85–89 |
| B+ | 80–84 |
| B | 75–79 |
| B- | 70–74 |
| C+ | 65–69 |
| C | 60–64 |
| C- | 55–59 |
| D+ | 50–54 |
| D | 45–49 |
| D- | 40–44 |
| F | Below 40 |
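The grade table translates directly into a lookup. In this sketch the lower bound of each range is used as a floor; how fractional scores such as 94.9 are rounded is not stated on this page, so treating them as belonging to the lower grade is an assumption.

```python
# Lower bound of each grade range, highest first.
GRADE_FLOORS = [
    (95, "A+"), (90, "A"), (85, "A-"),
    (80, "B+"), (75, "B"), (70, "B-"),
    (65, "C+"), (60, "C"), (55, "C-"),
    (50, "D+"), (45, "D"), (40, "D-"),
]


def grade_for(score: float) -> str:
    """Return the letter grade for a 0-100 health score."""
    for floor, grade in GRADE_FLOORS:
        if score >= floor:
            return grade
    return "F"  # anything below 40


print(grade_for(100), grade_for(87), grade_for(39))  # A+ A- F
```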
When Assessments Run
Initialization
When you first connect repositories to Octokraft, a full assessment runs across the entire codebase. This establishes your baseline scores.
On PR Merge
When a pull request merges to your default branch, Octokraft runs an incremental assessment that analyzes only the changed files and recalculates scores. This keeps your dashboard current without re-analyzing the entire codebase.
Drift Alerts
When your health score drops between assessments, Octokraft generates a drift alert. Each alert shows:
- The score before and after the change
- Which categories were affected
- Which specific issues caused the decline
- The pull request or event that triggered the change
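As a rough mental model, a drift alert is a comparison of two assessment snapshots. The sketch below is hypothetical throughout: the `DriftAlert` fields, the `detect_drift` function, and the sample data are illustrations of the fields listed above, not Octokraft's data model, and the list of specific issues is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DriftAlert:
    score_before: float
    score_after: float
    affected_categories: list
    trigger: str  # the pull request or event that caused the change


def detect_drift(
    overall_before: float,
    overall_after: float,
    categories_before: dict,
    categories_after: dict,
    trigger: str,
) -> Optional[DriftAlert]:
    """Return an alert only when the overall score dropped."""
    if overall_after >= overall_before:
        return None
    affected = sorted(
        name
        for name, score in categories_after.items()
        if score < categories_before.get(name, score)
    )
    return DriftAlert(overall_before, overall_after, affected, trigger)


alert = detect_drift(
    82.0, 78.0,
    {"security": 90.0, "duplication": 70.0},
    {"security": 75.0, "duplication": 70.0},
    trigger="example pull request",
)
print(alert)
```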
Dismiss Rules
Not every finding applies to every codebase. Some issues are intentional trade-offs, legacy code scheduled for removal, or patterns specific to your domain. Dismiss rules let you suppress specific findings without losing visibility.
Global Dismiss
Suppress a specific issue type everywhere in the project. Use this for patterns you have deliberately chosen to allow.
File-Scoped Dismiss
Suppress an issue only for a specific file. Use this for legacy code, generated files, or modules scheduled for replacement.
Dismissed issues are still detected and recorded for audit purposes. They simply do not count toward your health scores. You can review and reverse dismiss rules at any time.
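The two rule scopes can be modeled as a filter that separates issues into those that count toward the score and those that are only recorded. This is a sketch with hypothetical types and field names (`Issue`, `DismissRule`, `issue_type`, `file`); it is not Octokraft's rule format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Issue:
    issue_type: str
    file: str


@dataclass(frozen=True)
class DismissRule:
    issue_type: str
    file: Optional[str] = None  # None means a global dismiss


def is_dismissed(issue: Issue, rules: list) -> bool:
    """True if any rule matches the issue type and, when scoped, the file."""
    return any(
        rule.issue_type == issue.issue_type
        and (rule.file is None or rule.file == issue.file)
        for rule in rules
    )


def split_issues(issues: list, rules: list) -> tuple:
    """Return (counted, dismissed); dismissed issues stay recorded for audit."""
    counted = [issue for issue in issues if not is_dismissed(issue, rules)]
    dismissed = [issue for issue in issues if is_dismissed(issue, rules)]
    return counted, dismissed


rules = [
    DismissRule("long-parameter-list"),                # global
    DismissRule("dead-code", file="legacy/export.py"),  # file-scoped
]
issues = [
    Issue("long-parameter-list", "src/api.py"),
    Issue("dead-code", "legacy/export.py"),
    Issue("dead-code", "src/billing.py"),
]
counted, dismissed = split_issues(issues, rules)
print(len(counted), len(dismissed))  # 1 2
```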
Where Findings Come From
Issues on your dashboard come from 5 complementary sources, all merged into a unified view:
| Source | What It Detects |
|---|---|
| Static analyzers | Language-specific linting rules, syntax issues, known anti-patterns |
| AI analysis | Semantic issues that require understanding intent — logic errors, missing edge cases, architectural concerns |
| Code graph | Structural problems detected from cross-file analysis — dead code, circular dependencies, coupling metrics |
| Convention deviations | Code that breaks patterns established elsewhere in your codebase |
| Architecture review | System-level design issues — modularity violations, scaling bottlenecks, pattern inconsistencies |
What Gets Ignored
Standard non-source directories are automatically excluded from analysis:
- node_modules/, .venv/, vendor/: dependency directories
- dist/, build/, target/, bin/: build output
- __pycache__/, .pytest_cache/: runtime caches
- coverage/: test coverage output
- .git/: version control internals
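If you want to predict which files fall under these exclusions, a check along the following lines gives a reasonable approximation. The directory names come from the list above; matching them at any depth in the path, and the `is_ignored` helper itself, are assumptions rather than documented behavior.

```python
from pathlib import PurePosixPath

# Directory names taken from the exclusion list on this page.
IGNORED_DIRS = {
    "node_modules", ".venv", "vendor",   # dependency directories
    "dist", "build", "target", "bin",    # build output
    "__pycache__", ".pytest_cache",      # runtime caches
    "coverage",                          # test coverage output
    ".git",                              # version control internals
}


def is_ignored(file_path: str) -> bool:
    """True if any directory component of a file path is in the ignore set.

    ASSUMPTION: names match at any depth; the final component is treated
    as a file name and is not checked.
    """
    directories = PurePosixPath(file_path).parts[:-1]
    return any(part in IGNORED_DIRS for part in directories)


for path in ("node_modules/react/index.js", "src/app/main.py", "src/build.py"):
    print(path, is_ignored(path))
```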