what the scanners catch
pq-share runs Semgrep and CodeQL on every push. The findings are open. This page reports them honestly — what was caught, what was missed, and what the gap teaches about the limits of pattern-based SAST when the application's "weakness" is a deliberate policy choice rather than a syntactic mistake.
the setup
Two static analyzers run as GitHub Actions workflows on every push to main and on every pull request:

- Semgrep with `--config=auto`, which pulls a registry-managed rule set covering common security and correctness patterns across many languages.
- CodeQL with the `security-extended` query suite, run separately for Python and JavaScript/TypeScript.
Both write SARIF (and JSON, in Semgrep's case) and upload the results as workflow artifacts. The reports are public — anyone can download the artifacts from any run, since the repository is public.
Why this matters for pq-share specifically: the application intentionally exposes weak primitives in its picker. Selecting `AES-128-GCM`, `Ed25519` alone, or `X25519`-only is not a bug — it's the demonstration. A naive read of the SAST output would say "look at all these weak ciphers!" The honest read, below, is more nuanced.
Numbers below are from the original run on commit
f97d027 (May 2026), which is the run that motivated
this page. Both CodeQL findings were addressed in
a27fb8d
and 26d8c20;
the latest run shows 0 CodeQL findings, 4 Semgrep
findings (all the same false-positive rule applied to the
rename migration's hardcoded inputs). The walk-through
below preserves the original findings for educational value.
Four initial findings total across two scanners and roughly four thousand lines of code. By the standards of "auto-config Semgrep on a real codebase," that's quiet. By the standards of "demonstration application that intentionally offers weak cryptography for educational purposes," it's notable for what didn't appear.
what they caught
One finding at a time, with the verdict and the reasoning. Three labels are used: tp for true positive (real issue), fp for false positive in this context, and policy for findings that depend on intent (the syntax is fine; the question is whether using this primitive for this purpose is appropriate).
The first finding is Semgrep's `avoid-sqlalchemy-text` rule (verdict: fp). The flagged code is the in-place migration runner:
```python
async def _apply_column_additions(conn) -> None:
    for table, column, ddl in _PENDING_COLUMNS:
        rows = await conn.execute(text(f"PRAGMA table_info({table})"))
        existing = {row[1] for row in rows.fetchall()}
        if column not in existing:
            await conn.execute(text(f"ALTER TABLE {table} ADD COLUMN {column} {ddl}"))
```
Both the table and column names come from `_PENDING_COLUMNS`, a hardcoded constant tuple defined ten lines above. There is no path by which user input reaches the f-string. Semgrep's rule reads as taint-style, but in practice it flags the construct without tracing where the data comes from.
Why the rule is still valuable: if a future
contributor refactored _PENDING_COLUMNS to be
loaded from a config file, or if the migration grew to support
user-named tables, this exact site would become exploitable.
The rule is correctly drawing a line at the construct, not the
current data flow.
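If that refactor ever happens, the defensible pattern is to validate identifiers before interpolating them, since SQLite cannot parameterize identifiers. A minimal sketch of the technique (the entries and helper name here are illustrative, not pq-share's actual code):

```python
import re

# Illustrative stand-in for the hardcoded constant; the real tuple
# lives in the migration module.
_PENDING_COLUMNS = (
    ("shares", "expires_at", "TEXT"),
    ("shares", "download_count", "INTEGER DEFAULT 0"),
)

# SQLite cannot bind identifiers as parameters, so validate them instead.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def build_add_column(table: str, column: str, ddl: str) -> str:
    if not (_IDENT.match(table) and _IDENT.match(column)):
        raise ValueError(f"unsafe identifier: {table}.{column}")
    return f"ALTER TABLE {table} ADD COLUMN {column} {ddl}"

for t, c, d in _PENDING_COLUMNS:
    print(build_add_column(t, c, d))
```

With a check like this in place, the call site stays safe even if the column list later comes from configuration.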
The second finding is CodeQL's log-injection query (verdict: tp, low severity). The flagged line:

```python
log.info("sent email to=%s subject=%r via %s", to, subject, settings.smtp_host)
```
Both `to` (a recipient email address) and `subject` (currently a server-constructed string, but derived from sender email and file metadata) reach this log call. A malicious recipient address containing a newline
could forge a fake log line that looks like another event —
a classic log-injection pattern. Severity is low: the affected
log channel is structured and the impact is limited to log
readability, but the rule is correct that user-controlled data
is reaching a structured-log sink without sanitization.
Mitigation: either sanitize newlines from log fields explicitly, switch to a JSON log format that escapes them by construction, or use a logging library that handles this at the formatter level. We'll do one of these in a follow-up commit.
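A sanitizer of the first kind is only a few lines. This is a sketch of the general technique, not pq-share's eventual fix:

```python
import re

# Replace ASCII control characters (including \r and \n) so a
# user-supplied value cannot start a forged log line.
_CTRL = re.compile(r"[\x00-\x1f\x7f]")

def log_safe(value: str) -> str:
    return _CTRL.sub(" ", value)

forged = "victim@example.com\nINFO sent email to=admin@example.com"
print(log_safe(forged))  # one line: the injected newline becomes a space
```

One caveat worth knowing up front: a scanner's taint analysis may not recognize a homegrown helper like this as a sanitizer, which is an argument for the JSON-formatter approach instead.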
The third finding is CodeQL's clear-text-storage heuristic (verdict: fp, driven by naming). The flagged line:

```js
sessionStorage.setItem(SS.WRAPPED_BLOB, wrappedB64);
```
The data flow here is `resp.wrapped_priv_password` → `wrappedB64` → `sessionStorage`. CodeQL applies a heuristic: variables with names like `password`, `secret`, or `credential` shouldn't end up in storage that isn't explicitly designated for them.
The variable name is misleading. The value is not a password. It is the AES-GCM ciphertext of the user's private-key bundle, encrypted under a key derived from the password via Argon2id. The actual password never leaves the form input; the unwrapped private key never leaves the user's browser. Storing the already-encrypted bundle in `sessionStorage` is the design — it lets the user reload the tab and re-derive `wrapKey` with their password rather than having to log in again.

The rule is right; the variable is misleadingly named. Renaming `wrapped_priv_password` to `wrapped_priv_blob` would silence the heuristic without changing any behavior. We may make that change.
Both CodeQL findings are now fixed.
a27fb8d renamed `wrapped_priv_password` (column, API fields, JS callers) to `wrapped_priv_blob` with an idempotent `RENAME COLUMN` migration — the clear-text-storage finding cleared on the next CI run.
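An idempotent column rename can be sketched like this, using the stdlib sqlite3 module for brevity (an assumed shape; the project's actual migration runs through SQLAlchemy and may differ):

```python
import sqlite3

def rename_column_idempotent(conn, table: str, old: str, new: str) -> None:
    # PRAGMA table_info returns one row per column; index 1 is the name.
    cols = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if old in cols and new not in cols:
        # RENAME COLUMN requires SQLite >= 3.25 (2018).
        conn.execute(f"ALTER TABLE {table} RENAME COLUMN {old} TO {new}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, wrapped_priv_password TEXT)")
rename_column_idempotent(conn, "users", "wrapped_priv_password", "wrapped_priv_blob")
# Safe to run again: the column set already reflects the rename.
rename_column_idempotent(conn, "users", "wrapped_priv_password", "wrapped_priv_blob")
```

The guard on the current column set is what makes re-running the migration a no-op.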
The first attempt at the log-injection fix routed user input through a control-char sanitizer helper, but CodeQL's taint analysis didn't recognize the helper as a sanitizer and the finding returned at the new line. 26d8c20 took the cleaner approach: drop user-controlled fields from the log line entirely. The database is the authoritative audit trail; the log only needs to confirm SMTP succeeded. After 26d8c20, both CodeQL files report 0 findings.
The Semgrep `avoid-sqlalchemy-text` findings — 4 in the migration helpers — were initially left as-is so the alarm would re-fire if anyone fed user input into the DDL builders. After the workflow's failure was raising real noise on every push, we tried per-line `# nosemgrep` annotations (both rule-id-suffixed and bare) with a justifying comment block. Neither form was honored by `semgrep ci` or `semgrep scan` — the findings still appeared as "blocking". We landed on `--exclude-rule=…avoid-sqlalchemy-text…` in the Semgrep workflow command. The rule is now skipped globally rather than per-line; in exchange, a future `text(f"…")` call elsewhere in the code would slip through. That tradeoff is acceptable here because the rule fires on the construct regardless of data flow, so every hit would have to be triaged manually anyway.
Honest accounting: the chain of attempts was 5f1c09a (rule-id-suffixed `nosemgrep`, ignored) → e362586 (bare `nosemgrep`, ignored) → 4b0fd17 (switched to `semgrep scan --error`, exit-code elevation also failed) → 43bf3a2 (workflow-level exclude — clean). The empirical lesson: inline Semgrep suppression is unreliable when running `--config=auto` without a Semgrep Cloud token, and the only mechanism that works is dropping the rule at the CLI invocation.
what they missed
pq-share's picker offers five options for key exchange and five for digital signatures. A naive observer would expect a SAST scan to flag most of these — they include `X25519` alone (no PQ hedge), `ECDSA-P256` (broken by Shor's algorithm in polynomial time on a CRQC), `AES-128-GCM` (Grover-reducible to 64-bit effective security), and so on. Zero of those primitives were flagged by either scanner.
Why? Because the picker uses every primitive correctly. Each call site is a well-formed invocation of a vetted library:
```js
// frontend/crypto.js
export function ed25519Sign(message, priv) { return ed25519.sign(message, priv); }

export async function aesGcmSeal(key, plaintext) {
  const cryptoKey = await crypto.subtle.importKey("raw", key, "AES-GCM", false, ["encrypt"]);
  const iv = crypto.getRandomValues(new Uint8Array(12));
  ...
}
```
There is no `AES.MODE_ECB`, no hardcoded key, no
reused IV, no missing authentication tag. SAST tools detect
syntactic patterns — constructs that are wrong on
their face. The weakness pq-share is demonstrating is at a
different layer: it's a policy question of whether a
given primitive is appropriate for a given threat model.
Ed25519 is a perfectly correct EdDSA implementation; it is also
forgeable by a sufficiently large quantum computer. Both
statements are true. SAST tools can verify the first, not the
second.
Caught (syntactic)

- `text(f"...")` — possible SQL injection vector
- User input in log strings — log injection
- Sensitive-named variable in storage — heuristic match
- (Would catch: hardcoded keys, reused IVs, ECB mode, deprecated algorithms like MD5, missing auth tags, hardcoded JWT secrets, etc.)

Missed (policy)

- Picker offering `X25519` alone (no PQ hedge)
- Picker offering `secp384r1` alone (no PQ hedge)
- Picker offering `Ed25519` alone
- Picker offering `ECDSA-P384` alone
- Picker offering `AES-128-GCM` (Grover concern)
- Bundle re-encrypted under deterministic `wrapKey` — no forward secrecy on long-term storage
- Same recovery code grants full account access — single-factor recovery
Each item in the missed-policy list is a real consideration in pq-share's threat model — and each is documented openly in the user guide's threat model section. The right tool to catch them is not a SAST
scanner; it's an explicit policy: "this codebase declares
what it considers acceptable, and any contributor can read the
list." pq-share's CryptoSuite Pydantic schema
and PRESET_LOCKS table do exactly that. They
enumerate which primitive combinations are permitted under
which regulatory claim, and the validator rejects any upload
that lies about which preset it satisfies.
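The idea of policy-as-code can be sketched in a few lines of plain Python. The suite names and preset table below are illustrative, not pq-share's actual `PRESET_LOCKS`:

```python
from dataclasses import dataclass

# Hypothetical preset table: each preset allows an explicit set of
# (key-exchange, signature) pairs and nothing else.
PRESET_LOCKS = {
    "demo":      {("X25519", "Ed25519"), ("ML-KEM-768", "ML-DSA-65")},
    "pq-hybrid": {("X25519+ML-KEM-768", "Ed25519+ML-DSA-65")},
}

@dataclass(frozen=True)
class CryptoSuite:
    kex: str
    sig: str
    preset: str

    def validate(self) -> None:
        allowed = PRESET_LOCKS.get(self.preset, set())
        if (self.kex, self.sig) not in allowed:
            raise ValueError(
                f"({self.kex}, {self.sig}) is not permitted under preset {self.preset!r}"
            )

CryptoSuite("X25519", "Ed25519", "demo").validate()  # allowed under the demo preset
```

A classical-only suite claiming the pq-hybrid preset would raise, which is exactly the "upload that lies about which preset it satisfies" case.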
what this teaches
Three things, in increasing order of nuance.
1. SAST is high-leverage for syntactic mistakes
Of pq-share's four findings, two correspond to real patterns
that would become bugs under predictable refactors
(the text() migration code) or would benefit from a
structural mitigation (log sanitization). One was a heuristic
match driven by misleading naming. None of these would have
surfaced without a scanner. Run them. Triage the output.
Don't ignore findings just because the current call site
happens to be safe.
2. SAST cannot reason about appropriateness
The same scanners would, on most production codebases, generate dozens or hundreds of findings about weak crypto patterns — but only when the patterns are syntactic. A correct use of a weak primitive looks identical, on the wire and in the AST, to a correct use of a strong one. The decision lives in the threat model, not the source. Pair SAST with explicit policy enumeration.
3. The educational value of pq-share's findings is the gap
pq-share's whole point is that the choice of cipher suite is the subject of the demonstration. The picker exposes every primitive openly. Off-the-shelf SAST does not flag any of them, because there's nothing wrong with how they're used — only with whether they should be used at all, and that's not a question SAST is built to answer.
If you're shipping a real product, the lesson is to add
project-specific rules — Semgrep can write custom YAML rules
against the abstract syntax tree, CodeQL has its own query
language for this — that encode your policy. Forbid the picker's `"AES-128-GCM"` string literal in production code paths. Forbid the `Ed25519` sole-signature option in any preset that isn't labeled "demo." The scanner becomes a guardrail for human judgment, not a substitute for it.
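A custom Semgrep rule of that shape is short. The sketch below is illustrative only: the rule id, message, and path filter are invented, not part of pq-share's CI.

```yaml
rules:
  - id: forbid-aes-128-gcm-in-prod
    languages: [javascript, typescript]
    severity: ERROR
    message: AES-128-GCM is demo-only here; production paths must not offer it.
    # Matches the string literal anywhere in a non-excluded source file.
    pattern: '"AES-128-GCM"'
    paths:
      exclude:
        - frontend/demo/
```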
reproduce locally
Both scanners run on every push to the public repo. To run the same scans locally:
Semgrep
```sh
pip install semgrep
cd path/to/pq-share
semgrep --config=auto --json-output=semgrep.json --sarif-output=semgrep.sarif
```
CodeQL
```sh
# Install the CodeQL CLI from
# https://docs.github.com/en/code-security/codeql-cli
cd path/to/pq-share
codeql database create --language=python --source-root=. db-py
codeql database analyze --format=sarif-latest --output=python.sarif \
  db-py codeql/python-queries:codeql-suites/python-security-extended.qls
codeql database create --language=javascript --source-root=. db-js
codeql database analyze --format=sarif-latest --output=javascript.sarif \
  db-js codeql/javascript-queries:codeql-suites/javascript-security-extended.qls
```
From CI artifacts
Or download the latest CI artifacts directly with the GitHub CLI:
```sh
gh run list --limit 5 -R harpoonsecurity/pq-share
gh run download <run-id> -R harpoonsecurity/pq-share
```
That writes four directories — `semgrep-results-json/`, `semgrep-results-sarif/`, `codeql-results-python/`, `codeql-results-javascript-typescript/` — each with the corresponding raw output. Artifacts are retained for 90 days per run.
see also
- pq-share user guide — the application itself, eight keypairs, picker walkthrough, threat model
- Semgrep workflow definition
- CodeQL workflow definition
- Semgrep registry — the rules `--config=auto` pulls from
- CodeQL query help — what each rule looks for, with examples
- CodeQL CLI — for running the analysis locally