Data Type Selection Guide

This guide helps you choose the right data type for your specific use case in Grapa.:

Quick Decision Tree

Do you need data to persist after program restart?
├─ YES → Use $file() (persistent storage)
└─ NO → Continue below

Do you need mathematical operations, statistics, or linear algebra?
├─ YES → Use $VECTOR (mathematical operations)
└─ NO → Continue below

Is your dataset large (> 100-500 items) and memory efficiency matters?
├─ YES → Use {}.table() (in-memory BTree)
└─ NO → Use {} (linked list)

Detailed Comparison

Feature	`{}` (Linked List)	`{}.table()` (BTree)	`$VECTOR` (Mathematical)	`$file()` (Persistent)
Storage	In-memory	In-memory	In-memory	Disk-based
Persistence	No	No	No	Yes
Performance (small data)	Very fast	Slower	Very fast	Slowest (disk I/O)
Performance (large data)	Slower	Fast	Fast	Moderate (disk I/O)
Memory efficiency	Lower	Higher	High	Unlimited
Mathematical operations	No	No	Excellent	No
Statistical functions	No	No	Excellent	No
Linear algebra	No	No	Excellent	No
Range queries	No	Yes	No	Yes
Complex queries	No	Yes	No	Yes
Frequent modifications	Excellent	Good	Good	Poor
Data size limit	Memory limit	Memory limit	Memory limit	Disk space limit

Use Case Examples

Configuration Data

/* Use {} for configuration */
config = {
    "debug_mode": true,
    "max_connections": 100,
    "timeout": 30
};

Why: Small dataset, frequent access, no persistence needed.

Mathematical Data Processing

/* Use $VECTOR for mathematical operations */
data = [1, 2, 3, 4, 5].vector();

/* Statistical analysis */
mean = data.mean();        /* 3 */
std_dev = data.std();      /* 1.58 */
sum = data.sum();          /* 15 */

/* Element-wise operations */
doubled = data * 2;        /* #[2, 4, 6, 8, 10]# */
squared = data ** 2;       /* #[1, 4, 9, 16, 25]# */

/* Matrix operations */
matrix = [[1, 2], [3, 4]].vector();
determinant = matrix.det(); /* -2 */
inverse = matrix.inv();     /* Matrix inverse */

Why: Need mathematical operations, statistics, or linear algebra.

User Session Data

/* Use {} for session data */
session = {
    "user_id": 12345,
    "login_time": $TIME().utc(),
    "permissions": ["read", "write"]
};

Why: Small dataset, frequent modifications, temporary data.

Large Dataset with Range Queries

/* Use {}.table() for large datasets */
data = {}.table();
data.mkfield("timestamp", "TIME", "FIX", 8);
data.mkfield("value", "FLOAT", "FIX", 8);

/* Efficient range queries */
for record in data.range("timestamp", start_time, end_time) {
    /* Process records in time range */
};

Why: Large dataset, need range queries, memory efficiency important.

Persistent Log Data

/* Use $file() for persistent storage */
f = $file();
f.mk("logs", "ROW");
f.cd("logs");
f.mkfield("timestamp", "TIME", "FIX", 8);
f.mkfield("level", "STR", "VAR");
f.mkfield("message", "STR", "VAR");

/* Data persists after program restart */
f.setfield("log_001", "timestamp", $TIME().utc());
f.setfield("log_001", "level", "INFO");
f.setfield("log_001", "message", "Application started");

Why: Need persistence, large dataset, long-term storage.

Cache with Expiration

/* Use {} for cache */
cache = {};

/* Simple cache operations */
cache.set("user_123", {
    "name": "Alice",
    "email": "alice@example.com",
    "cached_at": $TIME().utc()
});

/* Check if cached data is still valid */
if (cache.get("user_123").cached_at.ms() < 300000) { /* 5 minutes */
    /* Use cached data */
} else {
    /* Refresh cache */
};

Why: Small to medium dataset, frequent access, temporary data.

Performance Guidelines

Dataset Size Thresholds

< 100 items: Always use {} (linked list)
100-500 items: Use {} unless you need range queries, memory efficiency, or mathematical operations
> 500 items: Consider {}.table() for better memory efficiency, or $VECTOR for mathematical operations
Very large datasets: Use $file() for persistence and unlimited size

Operation Patterns

Frequent insertions/deletions: Prefer {} (linked list)
Mathematical operations: Use $VECTOR (statistics, linear algebra, element-wise operations)
Range queries: Use {}.table() or $file()
Complex queries: Use {}.table() or $file()
Simple key-value access: Use {} (linked list)

Memory Considerations

Memory constrained: Use {}.table() for large datasets
Memory abundant: Use {} for simplicity
Mathematical processing: Use $VECTOR for optimized mathematical operations
Unlimited data: Use $file() for persistence

Common Mistakes

❌ Don't Do This

/* Don't use {}.table() for small datasets */
small_config = {}.table();
small_config.set("debug", true);
small_config.set("timeout", 30);

Why: Overhead of BTree structure makes it slower than {} for small datasets.

❌ Don't Do This

/* Don't use {} for mathematical operations */
data = [1, 2, 3, 4, 5];
mean = data.mean();  /* This won't work - {} doesn't have .mean() */
doubled = data * 2;  /* This won't work - {} doesn't support element-wise operations */

Why: {} doesn't support mathematical operations, statistics, or linear algebra.

❌ Don't Do This

/* Don't use {} for very large datasets */
large_dataset = {};
for i in (10000).range() {
    large_dataset.set("item_" + i.toString(), generate_data(i));
};

Why: Memory inefficiency and slower access for large datasets.

✅ Do This Instead

/* Use {} for small datasets */
small_config = {
    "debug": true,
    "timeout": 30
};

/* Use $VECTOR for mathematical operations */
data = [1, 2, 3, 4, 5].vector();
mean = data.mean();        /* 3 */
doubled = data * 2;        /* #[2, 4, 6, 8, 10]# */

/* Use {}.table() for large datasets */
large_dataset = {}.table();
for i in (10000).range() {
    large_dataset.set("item_" + i.toString(), generate_data(i));
};

Summary

{} (Linked List): Fast, simple, best for small to medium datasets
{}.table() (BTree): Memory efficient, supports complex queries, best for large datasets
$VECTOR (Mathematical): Optimized for mathematical operations, statistics, and linear algebra
$file() (Persistent): Disk-based, unlimited size, best for persistent storage

Key Insights: - The double-linked list implementation in {} is highly optimized. Don't add indexing - it would likely make things slower, not faster. - Use $VECTOR when you need mathematical operations, statistical analysis, or linear algebra - it's specifically optimized for these use cases.