Files
c-relay/query_analysis_report.md

10 KiB

Database Query Performance Analysis Report

Analysis Date: 2026-02-02
Log Duration: ~6 minutes (15:24:50 - 15:30:58)
Total Queries: 366 queries
Data Source: serverlog.txt


Executive Summary

The relay is experiencing moderate performance issues with an average query time of 10.4ms and a maximum query time of 672ms. The primary bottlenecks are:

  1. Tag-based searches using json_each() - 53% of all queries (194/366)
  2. Monitoring system queries - Taking 540-550ms each
  3. Multiple pubkey lookups - Kind 10002 queries with 15-50 pubkeys

Query Performance Metrics

Overall Statistics

  • Total Queries: 366
  • Average Query Time: 10,440 μs (10.4 ms)
  • Minimum Query Time: 14 μs
  • Maximum Query Time: 672,846 μs (672.8 ms)
  • Slow Queries (>10ms): 8 queries (2.2%)

Query Type Breakdown

Type Count Percentage
REQ 359 98.1%
MONITOR 7 1.9%

Critical Performance Issues

1. SLOWEST QUERY: 672ms Tag Search (IP: 192.42.116.178)

SELECT id, pubkey, created_at, kind, content, sig, tags 
FROM events 
WHERE 1=1 
  AND (kind < 20000 OR kind >= 30000) 
  AND kind IN (5,6300,7000,2004,1622) 
  AND EXISTS (
    SELECT 1 FROM json_each(json(tags)) 
    WHERE json_extract(value, '$[0]') = ? 
      AND json_extract(value, '$[1]') IN (?)
  ) 
ORDER BY created_at DESC 
LIMIT 100

Problem: Full table scan with JSON parsing for every row
Impact: 672ms for 0 results (wasted computation)
Root Cause: No index on tag values, requires scanning all events


2. Monitoring System Queries: 540-550ms Each

SELECT * FROM active_subscriptions_log 
ORDER BY created_at DESC

Occurrences: 4 queries in 6 minutes
Average Time: 545ms
Rows Returned: 20-52 rows
Problem: Extremely slow for small result sets
Root Cause: Likely missing index on created_at column


3. Tag-Based Searches (json_each) - 53% of All Queries

  • Total: 194 queries (53% of all queries)
  • Pattern: EXISTS (SELECT 1 FROM json_each(json(tags)) WHERE ...)
  • Most Common: Kind 1984 (105 queries), Kind 1111 (47 queries)
  • Problem: Every tag search requires full JSON parsing

Example Pattern:

WHERE kind IN (1984) 
  AND pubkey IN (?) 
  AND EXISTS (
    SELECT 1 FROM json_each(json(tags)) 
    WHERE json_extract(value, '$[0]') = ? 
      AND json_extract(value, '$[1]') IN (?,?,?,?,?,?,?,?,?,?,?)
  )

4. Multiple Pubkey Lookups (Kind 10002)

  • Total: 64 queries for kind 10002
  • Average Time: 2,500-3,300 μs (2.5-3.3ms)
  • Pattern: Queries with 15-50 pubkeys in IN clause
  • Problem: Large IN clauses without proper indexing

Example:

WHERE kind IN (10002) 
  AND pubkey IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)

Client Activity Analysis

Top Query Sources (by IP)

IP Address Query Count Percentage Notes
45.84.107.222 101 27.6% Highest activity
23.234.109.54 69 18.9% Second highest
185.220.101.38 56 15.3% Third highest
192.42.116.178 51 13.9% Source of 672ms query
149.22.80.85 34 9.3%
174.138.53.241 24 6.6%
Others 31 8.5% 6 other IPs

Observation: Top 3 IPs account for 61.8% of all queries


Most Common Query Patterns

Kind Filter Query Count Percentage
kind IN (1984) 105 28.7%
kind IN (10002) 64 17.5%
kind IN (1111) 47 12.8%
kind IN (0,2,3,10002) 24 6.6%
kind IN (9735) 23 6.3%
kind IN (0,30315,10002,10050) 20 5.5%
Others 83 22.7%

Optimization Recommendations

Priority 1: Critical (Immediate Action Required)

1.1 Add Index on active_subscriptions_log.created_at

Impact: Will reduce monitoring queries from 540ms to <10ms
Effort: Low
SQL:

CREATE INDEX IF NOT EXISTS idx_active_subscriptions_created_at 
ON active_subscriptions_log(created_at DESC);

1.2 Implement Tag Indexing System

Impact: Will reduce tag searches from 100-600ms to <10ms
Effort: High
Options:

  • Option A: Create separate event_tags table with indexes
    CREATE TABLE event_tags (
      event_id TEXT NOT NULL,
      tag_name TEXT NOT NULL,
      tag_value TEXT NOT NULL,
      FOREIGN KEY (event_id) REFERENCES events(id)
    );
    CREATE INDEX idx_event_tags_lookup ON event_tags(tag_name, tag_value);
    
  • Option B: Use SQLite JSON1 extension with generated columns (if available)
  • Option C: Implement application-level tag caching

Recommended: Option A (most reliable and performant)


Priority 2: High (Implement Within Week)

2.1 Optimize Multiple Pubkey Queries

Current: pubkey IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)
Problem: Large IN clauses are inefficient
Solution:

  • Add composite index: CREATE INDEX idx_events_kind_pubkey ON events(kind, pubkey, created_at DESC);
  • Consider query rewriting for >10 pubkeys

2.2 Add Query Result Caching

Target Queries:

  • Kind 0 (profile) lookups - frequently repeated
  • Kind 10002 (relay lists) - relatively static
  • Kind 1984 (reports) - can be cached for 30-60 seconds

Implementation:

  • Use in-memory LRU cache (e.g., 1000 entries)
  • Cache key: hash of SQL + parameters
  • TTL: 30-60 seconds for most queries

Priority 3: Medium (Optimize Over Time)

3.1 Disable or Throttle Monitoring Queries

Current: Monitoring queries run every ~60 seconds
Impact: Each query takes 540ms
Options:

  • Increase throttle to 300 seconds (5 minutes)
  • Disable monitoring entirely if not actively used
  • Optimize active_subscriptions_log table structure

3.2 Implement Query Complexity Limits

Problem: Some queries scan entire table (e.g., 672ms query returned 0 results)
Solution:

  • Reject queries without time bounds (require since or until)
  • Limit number of kinds in single query (max 10)
  • Limit number of pubkeys in single query (max 20)

3.3 Add Composite Indexes for Common Patterns

-- For kind + created_at queries (most common pattern)
CREATE INDEX idx_events_kind_created ON events(kind, created_at DESC);

-- For kind + pubkey + created_at queries
CREATE INDEX idx_events_kind_pubkey_created ON events(kind, pubkey, created_at DESC);

Database Schema Recommendations

Current Issues

  1. No tag indexing - Forces full table scans with JSON parsing
  2. Missing created_at indexes - Monitoring queries are extremely slow
  3. No composite indexes - Multi-condition queries inefficient
-- 1. Add tag indexing table
CREATE TABLE IF NOT EXISTS event_tags (
    event_id TEXT NOT NULL,
    tag_name TEXT NOT NULL,
    tag_value TEXT NOT NULL,
    FOREIGN KEY (event_id) REFERENCES events(id) ON DELETE CASCADE
);

CREATE INDEX idx_event_tags_lookup ON event_tags(tag_name, tag_value);
CREATE INDEX idx_event_tags_event_id ON event_tags(event_id);

-- 2. Add monitoring table index
CREATE INDEX IF NOT EXISTS idx_active_subscriptions_created_at 
ON active_subscriptions_log(created_at DESC);

-- 3. Add composite indexes for common query patterns
CREATE INDEX IF NOT EXISTS idx_events_kind_created 
ON events(kind, created_at DESC);

CREATE INDEX IF NOT EXISTS idx_events_kind_pubkey_created 
ON events(kind, pubkey, created_at DESC);

-- 4. Add index for pubkey lookups
CREATE INDEX IF NOT EXISTS idx_events_pubkey_created 
ON events(pubkey, created_at DESC);

Expected Performance Improvements

Optimization Current Avg Expected Avg Improvement
Tag searches (with event_tags table) 100-600ms 5-20ms 95-97%
Monitoring queries (with index) 540ms 5-10ms 98%
Multiple pubkey queries (with index) 2.5-3.3ms 0.5-1ms 70-80%
Overall average query time 10.4ms 2-3ms 70-80%

Client Behavior Analysis

Potentially Abusive Patterns

IP: 192.42.116.178 (51 queries)

  • Issue: Generated the slowest query (672ms)
  • Pattern: Complex tag searches with multiple kinds
  • Recommendation: Monitor for repeated expensive queries

IP: 45.84.107.222 (101 queries - 27.6% of total)

  • Issue: Highest query volume
  • Pattern: Mix of kind 10002, 1984, and various other kinds
  • Recommendation: Acceptable if queries are efficient; monitor for abuse

Normal Behavior

  • Most queries are <1ms (fast)
  • Majority return 0-10 rows (reasonable)
  • Query patterns match typical Nostr client behavior

Action Plan

Immediate (Today)

  1. Add index on active_subscriptions_log.created_at
  2. Increase monitoring throttle from 60s to 300s (or disable)
  3. Monitor IP 192.42.116.178 for repeated expensive queries

This Week

  1. Design and implement event_tags table
  2. Add composite indexes for common query patterns
  3. Implement query complexity limits (require time bounds)

This Month

  1. Implement query result caching (LRU cache)
  2. Add query cost estimation and rejection
  3. Optimize subscription matching algorithm

Monitoring Recommendations

Key Metrics to Track

  1. Average query time - Target: <5ms
  2. P95 query time - Target: <50ms
  3. P99 query time - Target: <100ms
  4. Queries >100ms - Target: <1% of queries
  5. Tag search percentage - Target: <30% after optimization

Alert Thresholds

  • Critical: Average query time >20ms for 5 minutes
  • Warning: Any single query >1000ms (1 second)
  • Info: Client making >100 queries/minute

Conclusion

The relay is experiencing performance issues primarily due to:

  1. Lack of tag indexing (53% of queries affected)
  2. Missing indexes on monitoring tables (540ms queries)
  3. Inefficient multiple pubkey lookups

Implementing the recommended optimizations will reduce average query time from 10.4ms to 2-3ms (70-80% improvement) and eliminate the 500-600ms slow queries entirely.

Priority: Implement tag indexing system and add missing indexes within 1 week to prevent performance degradation as database grows.