Basic relay functionality completed
docs/advanced_schema_design.md
# Advanced Nostr Relay Schema Design

## Overview

This document outlines the design for an advanced multi-table schema that enforces Nostr protocol compliance at the database level, with separate tables for different event types based on their storage and replacement characteristics.

## Event Type Classification

Following the Nostr specification (NIP-01), events fall into four categories by kind:

### 1. Regular Events
- **Kinds**: `1000 <= n < 10000 || 4 <= n < 45 || n == 1 || n == 2`
- **Storage Policy**: All events stored permanently
- **Examples**: Text notes (1), Reposts (6), Reactions (7), Direct Messages (4)

### 2. Replaceable Events
- **Kinds**: `10000 <= n < 20000 || n == 0 || n == 3`
- **Storage Policy**: Only the latest event per `(pubkey, kind)` combination
- **Replacement Logic**: Keep the event with the latest `created_at`; on a tie, keep the one with the lowest `id` in lexical order
- **Examples**: Metadata (0), Contacts (3), Mute List (10000)

### 3. Ephemeral Events
- **Kinds**: `20000 <= n < 30000`
- **Storage Policy**: Not expected to be stored (optional temporary storage)
- **Examples**: Typing indicators, presence updates, ephemeral messages

### 4. Addressable Events
- **Kinds**: `30000 <= n < 40000`
- **Storage Policy**: Only the latest event per `(pubkey, kind, d_tag)` combination
- **Replacement Logic**: Same as replaceable events
- **Examples**: Long-form content (30023), Application-specific data

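
The replacement rule shared by replaceable and addressable events can be sketched in C as a single comparison. This is a minimal illustration rather than the relay's actual code: the `event_ref_t` struct and the `event_supersedes` function are hypothetical names, and ids are assumed to be lowercase hex strings so that `strcmp` yields the lexical order.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical minimal view of an event: only the fields the rule needs. */
typedef struct {
    int64_t created_at;   /* Unix timestamp in seconds */
    const char *id;       /* lowercase hex event id, NUL-terminated (assumption) */
} event_ref_t;

/* Returns true when `incoming` should replace `existing`:
 * a later created_at wins; on a tie, the lexically lower id wins. */
bool event_supersedes(const event_ref_t *incoming, const event_ref_t *existing) {
    if (incoming->created_at != existing->created_at) {
        return incoming->created_at > existing->created_at;
    }
    return strcmp(incoming->id, existing->id) < 0;
}
```

An event never supersedes itself under this rule, so a resubmitted duplicate leaves the stored row untouched.
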

## SQLite JSON Capabilities Research

SQLite provides powerful JSON functions that could be leveraged for tag storage:

### Core JSON Functions
```sql
-- Extract specific values
json_extract(column, '$.path')

-- Iterate through arrays
json_each(json_array_column)

-- Flatten nested structures
json_tree(json_column)

-- Validate JSON structure
json_valid(column)

-- Array operations
json_array_length(column)
json_extract(column, '$[0]') -- First element
```


### Tag Query Examples

#### Find all 'e' tag references:
```sql
-- events.id is qualified because json_each() also exposes a column named id
SELECT
    events.id,
    json_extract(tag.value, '$[1]') AS referenced_event_id,
    json_extract(tag.value, '$[2]') AS relay_hint,
    json_extract(tag.value, '$[3]') AS marker
FROM events, json_each(events.tags) AS tag
WHERE json_extract(tag.value, '$[0]') = 'e';
```

#### Find events with specific hashtags:
```sql
SELECT events.id, events.content
FROM events, json_each(events.tags) AS tag
WHERE json_extract(tag.value, '$[0]') = 't'
  AND json_extract(tag.value, '$[1]') = 'bitcoin';
```

#### Extract 'd' tag for addressable events:
```sql
-- Returns the first 'd' tag of one event. Without the id filter, LIMIT 1
-- would cap the whole result set at a single row.
-- :event_id is a bind parameter supplied by the caller (assumption).
SELECT
    events.id,
    json_extract(tag.value, '$[1]') AS d_tag_value
FROM events, json_each(events.tags) AS tag
WHERE events.id = :event_id
  AND json_extract(tag.value, '$[0]') = 'd'
ORDER BY tag.key
LIMIT 1;
```

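### JSON Functional Indexes
```sql
-- SQLite JSON paths have no [*] wildcard, and an expression index stores
-- one value per row, so these indexes can only cover a fixed tag position
-- (here the first tag). Matching a tag at any position needs json_each()
-- at query time or a separate tag table (see Option 2).

-- Index on hashtags (first tag only)
CREATE INDEX idx_hashtags ON events(
    json_extract(tags, '$[0][1]')
) WHERE json_extract(tags, '$[0][0]') = 't';

-- Index on 'd' tags for addressable events (first tag only)
CREATE INDEX idx_d_tags ON events_addressable(
    json_extract(tags, '$[0][1]')
) WHERE json_extract(tags, '$[0][0]') = 'd';
```
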
## Proposed Schema Design

### Option 1: Separate Tables with JSON Tags

```sql
-- Regular Events (permanent storage)
CREATE TABLE events_regular (
    id TEXT PRIMARY KEY,
    pubkey TEXT NOT NULL,
    created_at INTEGER NOT NULL,
    kind INTEGER NOT NULL,
    content TEXT NOT NULL,
    sig TEXT NOT NULL,
    tags JSON,
    first_seen INTEGER DEFAULT (strftime('%s', 'now')),
    CONSTRAINT kind_regular CHECK (
        (kind >= 1000 AND kind < 10000) OR
        (kind >= 4 AND kind < 45) OR
        kind = 1 OR kind = 2
    )
);

-- Replaceable Events (latest per pubkey+kind)
CREATE TABLE events_replaceable (
    pubkey TEXT NOT NULL,
    kind INTEGER NOT NULL,
    id TEXT NOT NULL,
    created_at INTEGER NOT NULL,
    content TEXT NOT NULL,
    sig TEXT NOT NULL,
    tags JSON,
    replaced_at INTEGER DEFAULT (strftime('%s', 'now')),
    PRIMARY KEY (pubkey, kind),
    CONSTRAINT kind_replaceable CHECK (
        (kind >= 10000 AND kind < 20000) OR
        kind = 0 OR kind = 3
    )
);

-- Ephemeral Events (temporary/optional storage)
CREATE TABLE events_ephemeral (
    id TEXT PRIMARY KEY,
    pubkey TEXT NOT NULL,
    created_at INTEGER NOT NULL,
    kind INTEGER NOT NULL,
    content TEXT NOT NULL,
    sig TEXT NOT NULL,
    tags JSON,
    expires_at INTEGER DEFAULT (strftime('%s', 'now', '+1 hour')),
    CONSTRAINT kind_ephemeral CHECK (
        kind >= 20000 AND kind < 30000
    )
);

-- Addressable Events (latest per pubkey+kind+d_tag)
CREATE TABLE events_addressable (
    pubkey TEXT NOT NULL,
    kind INTEGER NOT NULL,
    d_tag TEXT NOT NULL,
    id TEXT NOT NULL,
    created_at INTEGER NOT NULL,
    content TEXT NOT NULL,
    sig TEXT NOT NULL,
    tags JSON,
    replaced_at INTEGER DEFAULT (strftime('%s', 'now')),
    PRIMARY KEY (pubkey, kind, d_tag),
    CONSTRAINT kind_addressable CHECK (
        kind >= 30000 AND kind < 40000
    )
);
```

### Indexes for Performance

```sql
-- Regular events indexes
CREATE INDEX idx_regular_pubkey ON events_regular(pubkey);
CREATE INDEX idx_regular_kind ON events_regular(kind);
CREATE INDEX idx_regular_created_at ON events_regular(created_at);
CREATE INDEX idx_regular_kind_created_at ON events_regular(kind, created_at);

-- Replaceable events indexes
CREATE INDEX idx_replaceable_created_at ON events_replaceable(created_at);
CREATE INDEX idx_replaceable_id ON events_replaceable(id);

-- Ephemeral events indexes
CREATE INDEX idx_ephemeral_expires_at ON events_ephemeral(expires_at);
CREATE INDEX idx_ephemeral_pubkey ON events_ephemeral(pubkey);

-- Addressable events indexes
CREATE INDEX idx_addressable_created_at ON events_addressable(created_at);
CREATE INDEX idx_addressable_id ON events_addressable(id);

-- JSON tag indexes (examples)
-- SQLite JSON paths have no [*] wildcard, so each index covers only the
-- first tag of an event; tags at other positions need json_each() or a
-- separate tag table.
CREATE INDEX idx_regular_e_tags ON events_regular(
    json_extract(tags, '$[0][1]')
) WHERE json_extract(tags, '$[0][0]') = 'e';

CREATE INDEX idx_regular_p_tags ON events_regular(
    json_extract(tags, '$[0][1]')
) WHERE json_extract(tags, '$[0][0]') = 'p';
```

### Option 2: Unified Tag Table Approach

```sql
-- Unified tag storage (alternative to JSON)
CREATE TABLE tags_unified (
    event_id TEXT NOT NULL,
    event_type TEXT NOT NULL,   -- 'regular', 'replaceable', 'ephemeral', 'addressable'
    tag_index INTEGER NOT NULL, -- Position in tag array
    name TEXT NOT NULL,
    value TEXT NOT NULL,
    param_2 TEXT,               -- Third element if present
    param_3 TEXT,               -- Fourth element if present
    param_json TEXT,            -- JSON for additional parameters
    PRIMARY KEY (event_id, tag_index)
);

CREATE INDEX idx_tags_name_value ON tags_unified(name, value);
CREATE INDEX idx_tags_event_type ON tags_unified(event_type);
```
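
To make the column mapping concrete, here is a minimal C sketch of how one tag could be flattened into a `tags_unified` row before insertion. The `tag_row_t` struct and `tag_to_row` function are hypothetical, the tag is assumed to arrive as a plain array of C strings, and serializing the extra elements into `param_json` is left out; the sketch only counts them.

```c
#include <stddef.h>

/* Hypothetical in-memory form of one tags_unified row.
 * All pointers borrow from the input array; nothing is copied. */
typedef struct {
    int tag_index;        /* position of the tag in the event's tag array */
    const char *name;     /* element 0 */
    const char *value;    /* element 1 */
    const char *param_2;  /* element 2, or NULL if absent */
    const char *param_3;  /* element 3, or NULL if absent */
    size_t extra_count;   /* elements beyond the fourth, destined for param_json */
} tag_row_t;

/* Fills `row` from one tag given as `count` strings.
 * Returns 0 on success, or -1 when the tag has fewer than two elements,
 * since name and value are both NOT NULL in the table. */
int tag_to_row(const char *const *elems, size_t count, int tag_index, tag_row_t *row) {
    if (elems == NULL || row == NULL || count < 2) {
        return -1;
    }
    row->tag_index = tag_index;
    row->name = elems[0];
    row->value = elems[1];
    row->param_2 = (count > 2) ? elems[2] : NULL;
    row->param_3 = (count > 3) ? elems[3] : NULL;
    row->extra_count = (count > 4) ? count - 4 : 0;
    return 0;
}
```

Rejecting single-element tags here mirrors the `NOT NULL` constraint on `value`; a design that needs to keep such tags would have to relax that column.
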

## Implementation Strategy

### 1. Kind Classification Function (C Code)
```c
typedef enum {
    EVENT_TYPE_REGULAR,
    EVENT_TYPE_REPLACEABLE,
    EVENT_TYPE_EPHEMERAL,
    EVENT_TYPE_ADDRESSABLE,
    EVENT_TYPE_INVALID
} event_type_t;

event_type_t classify_event_kind(int kind) {
    if ((kind >= 1000 && kind < 10000) ||
        (kind >= 4 && kind < 45) ||
        kind == 1 || kind == 2) {
        return EVENT_TYPE_REGULAR;
    }

    if ((kind >= 10000 && kind < 20000) ||
        kind == 0 || kind == 3) {
        return EVENT_TYPE_REPLACEABLE;
    }

    if (kind >= 20000 && kind < 30000) {
        return EVENT_TYPE_EPHEMERAL;
    }

    if (kind >= 30000 && kind < 40000) {
        return EVENT_TYPE_ADDRESSABLE;
    }

    return EVENT_TYPE_INVALID;
}
```
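
Routing an incoming event to its table could build directly on these ranges. The sketch below is one possible shape, not the relay's implementation: `table_for_kind` is a hypothetical helper, and it repeats the range checks instead of calling `classify_event_kind` only so that it compiles on its own.

```c
#include <stddef.h>

/* Returns the name of the table that should hold an event of this kind,
 * or NULL when the kind falls outside every range in this design. */
const char *table_for_kind(int kind) {
    if ((kind >= 1000 && kind < 10000) ||
        (kind >= 4 && kind < 45) ||
        kind == 1 || kind == 2) {
        return "events_regular";
    }
    if ((kind >= 10000 && kind < 20000) || kind == 0 || kind == 3) {
        return "events_replaceable";
    }
    if (kind >= 20000 && kind < 30000) {
        return "events_ephemeral";
    }
    if (kind >= 30000 && kind < 40000) {
        return "events_addressable";
    }
    return NULL;  /* e.g. kinds 45..999 or 40000 and above */
}
```

A caller would treat a `NULL` result the same way as `EVENT_TYPE_INVALID` and reject the event before it reaches the database.
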

### 2. Replacement Logic for Replaceable Events
```sql
-- Trigger for replaceable events.
-- If the stored row is older (or ties on created_at with a higher id), it is
-- deleted and the INSERT proceeds. Otherwise the stored row stays and the
-- INSERT fails on the (pubkey, kind) primary key, so the caller must handle
-- that constraint error or use INSERT OR IGNORE.
CREATE TRIGGER replace_event_on_insert
BEFORE INSERT ON events_replaceable
FOR EACH ROW
WHEN EXISTS (
    SELECT 1 FROM events_replaceable
    WHERE pubkey = NEW.pubkey AND kind = NEW.kind
)
BEGIN
    DELETE FROM events_replaceable
    WHERE pubkey = NEW.pubkey
      AND kind = NEW.kind
      AND (
          created_at < NEW.created_at OR
          (created_at = NEW.created_at AND id > NEW.id)
      );
END;
```

### 3. D-Tag Extraction for Addressable Events
```c
#include <stdlib.h>   /* free, for callers */
#include <string.h>   /* strcmp, strdup */
#include "cJSON.h"    /* include path is an assumption; adjust to the build */

/* Returns a heap-allocated copy of the first 'd' tag value, or "" when the
 * event has no 'd' tag. Returns NULL if tags is not an array or if the
 * allocation fails. The caller must free() the result. */
char* extract_d_tag(cJSON* tags) {
    if (!tags || !cJSON_IsArray(tags)) {
        return NULL;
    }

    cJSON* tag;
    cJSON_ArrayForEach(tag, tags) {
        if (cJSON_IsArray(tag) && cJSON_GetArraySize(tag) >= 2) {
            cJSON* tag_name = cJSON_GetArrayItem(tag, 0);
            cJSON* tag_value = cJSON_GetArrayItem(tag, 1);

            if (cJSON_IsString(tag_name) && cJSON_IsString(tag_value)) {
                if (strcmp(cJSON_GetStringValue(tag_name), "d") == 0) {
                    return strdup(cJSON_GetStringValue(tag_value));
                }
            }
        }
    }

    return strdup(""); // Default empty d-tag
}
```

## Advantages of This Design

### 1. Protocol Compliance
- **Enforced at DB level**: `CHECK` constraints reject events whose kind does not belong in a table
- **Automatic replacement**: Triggers can handle replaceable and addressable event logic
- **Type safety**: Separate tables ensure correct handling per event type

### 2. Performance Benefits
- **Targeted indexes**: Each table is indexed for its own access patterns
- **Reduced storage**: Ephemeral events can be auto-expired using `expires_at`
- **Query optimization**: SQLite can optimize queries per table structure

### 3. JSON Tag Benefits
- **Atomic storage**: Tags are stored with their event
- **Rich querying**: SQLite JSON functions enable complex tag queries
- **Schema flexibility**: Can handle arbitrary tag structures
- **Expression indexes**: Can index a tag at a fixed array position; matching at any position needs `json_each` or a tag table

## Migration Strategy

1. **Phase 1**: Create the new schema alongside the existing one
2. **Phase 2**: Implement kind classification and routing logic
3. **Phase 3**: Migrate existing data to the appropriate tables
4. **Phase 4**: Update application logic to use the new tables
5. **Phase 5**: Drop the old schema after verification

## Next Steps for Implementation

1. **Prototype JSON performance**: Create a test database with sample data
2. **Benchmark query patterns**: Compare JSON vs normalized approaches
3. **Implement kind classification**: Add routing logic to the C code
4. **Create migration scripts**: Handle existing data transformation
5. **Update test suite**: Verify compliance with the new schema