This guide tours the advanced features of putior that go beyond basic annotation extraction. Learn how to auto-detect workflows, create interactive diagrams, customize detection patterns, and more.
Feature Overview
| Feature | Purpose | Key Functions |
|---|---|---|
| Auto-Annotation | Detect workflows without writing annotations | put_auto(), put_generate(), put_merge() |
| Interactive Diagrams | Clickable nodes, source info display | enable_clicks, show_source_info |
| Detection Patterns | View/customize what gets detected | get_detection_patterns() |
| Interactive Sandbox | Experiment without writing files | run_sandbox() |
| Structured Logging | Debug annotation parsing | set_putior_log_level() |
| Themes & Styling | Customize diagram appearance | theme, style_nodes |
| File Exclusion | Skip files by regex pattern | exclude parameter |
| Custom Themes | Create your own color palettes | put_theme(), palette parameter |
| Performance | Optimize for large codebases | pattern, recursive, validate, exclude |
Auto-Annotation System
The auto-annotation system analyzes your code to detect workflow elements automatically, similar to how roxygen2 generates documentation skeletons.
Why Auto-Annotation?
- Instant visualization: See data flow in unfamiliar codebases immediately
- Annotation templates: Generate starting points for manual annotations
- Hybrid workflows: Combine manual control with auto-detection for completeness
- Project onboarding: Help new team members understand code quickly
put_auto() - Detect Workflow Automatically
Analyzes source code patterns to detect file inputs, outputs, and dependencies without requiring any annotations. (Full API docs)
library(putior)
# Auto-detect workflow from code patterns
workflow <- put_auto("./src/")
# View what was detected
print(workflow)
# Generate diagram
put_diagram(workflow)
Example Auto-Detection Result:
flowchart TD
load_data_R_1["load_data.R"]
process_R_1["process.R"]
report_R_1["report.R"]
%% Connections
load_data_R_1 --> process_R_1
process_R_1 --> report_R_1
%% Styling
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class load_data_R_1 processStyle
class process_R_1 processStyle
class report_R_1 processStyle
Note: Auto-detected labels default to file names. Use put_generate() to create annotation templates with better labels.
What Gets Detected:
For R code:
- Inputs: read.csv(), read_csv(), readRDS(), load(), fread(), read_excel(), fromJSON(), read_parquet(), database connections, etc.
- Outputs: write.csv(), saveRDS(), ggsave(), write_parquet(), database writes, etc.
- Dependencies: source(), sys.source()
For Python code:
- Inputs: pd.read_csv(), json.load(), pickle.load(), database connections, etc.
- Outputs: df.to_csv(), json.dump(), plt.savefig(), database writes, etc.
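To make the idea of a detection pattern concrete, here is a minimal base-R sketch. It is not putior's implementation: the helper name `detect_inputs` and the path-capturing regex are assumptions made for illustration. It finds `read.csv()` calls in a few lines of code and extracts the first quoted argument when one is present.

```r
# Hypothetical helper: find read.csv() calls and extract a literal file path.
detect_inputs <- function(code_lines) {
  call_regex <- "read\\.csv\\s*\\("                            # matches the call itself
  path_regex <- "read\\.csv\\s*\\(\\s*[\"']([^\"']+)[\"']"     # captures a quoted first argument
  hits <- code_lines[grepl(call_regex, code_lines)]
  matches <- regmatches(hits, regexec(path_regex, hits))
  # Element 2 of each match is the captured path; calls without a literal path give NA
  paths <- vapply(
    matches,
    function(m) if (length(m) >= 2) m[2] else NA_character_,
    character(1)
  )
  paths[!is.na(paths)]
}

code <- c(
  'raw <- read.csv("data/raw.csv")',
  'clean <- transform(raw)',
  'extra <- read.csv (path_variable)'
)
inputs <- detect_inputs(code)
inputs
```

The third line is matched as a call but contributes no path, because its argument is a variable and not a string literal. Static detection of this kind can only report paths that are written out in the source.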
Control Detection:
# Only detect inputs and outputs (skip dependencies)
workflow <- put_auto("./src/", detect_dependencies = FALSE)
# Only detect outputs
workflow <- put_auto("./src/", detect_inputs = FALSE, detect_dependencies = FALSE)
When Things Go Wrong with Auto-Detection:
Problem: Detects wrong files or patterns
- Use put_merge() with merge_strategy = "manual_priority" to override with manual annotations
- Exclude files with a specific pattern argument: put_auto("./src/", pattern = "^(?!test_).*\\.R$")
Problem: Misses important file I/O
- Check if your library is supported: grep("yourfunc", sapply(get_detection_patterns("r")$input, `[[`, "func"))
- Add manual annotations for unsupported patterns
- Request new patterns via GitHub issue
Problem: Too much noise in results
- Use detect_dependencies = FALSE to skip source() detection
- Filter results: workflow[workflow$node_type != "dependency", ]
- Switch to manual annotations for precise control
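The exclusion pattern above uses a negative lookahead, which base R only understands with Perl-compatible regular expressions. The sketch below checks such a pattern against a few hypothetical file names with `grepl(perl = TRUE)` before handing it to a scan function. Whether putior applies the pattern with PCRE internally is not confirmed here, so treat this as a way to sanity-check a pattern, not as a description of putior's behavior.

```r
# Hypothetical file names standing in for the contents of ./src/
files <- c("load.R", "test_load.R", "process.R", "notes.md")

# Keep .R files whose names do not start with "test_"
keep_pattern <- "^(?!test_).*\\.R$"
kept <- files[grepl(keep_pattern, files, perl = TRUE)]
kept
```

Without `perl = TRUE`, base R's default regex engine does not support `(?!...)`, so the same pattern would not be evaluated as a lookahead.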
put_generate() - Generate Annotation Comments
Creates PUT annotation comments that you can add to your source files. Think of it like roxygen2’s skeleton generation. (Full API docs)
# Print suggested annotations to console
put_generate("./src/")
Example Output:
# For file: process_data.R
# put id:"process_data", label:"Process Data", node_type:"process", input:"raw_data.csv", output:"clean_data.csv"
# For file: analyze.R
# put id:"analyze", label:"Analyze", node_type:"process", input:"clean_data.csv", output:"results.json"
Copy to Clipboard:
# Copy annotations to clipboard for pasting
put_generate("./src/", output = "clipboard")
Annotation Styles:
# Single-line style (default)
put_generate("./src/", style = "single")
# Output: # put id:"step1", label:"Step 1", input:"a.csv", output:"b.csv"
# Multiline style for complex annotations
put_generate("./src/", style = "multiline")
# Output:
# # put id:"step1", \
# # label:"Step 1", \
# # input:"a.csv", \
# # output:"b.csv"
put_merge() - Combine Manual + Auto
Combines your manual annotations with auto-detected ones using configurable merge strategies. (Full API docs)
# Manual annotations take priority
workflow <- put_merge("./src/", merge_strategy = "manual_priority")
# Auto fills in missing input/output fields
workflow <- put_merge("./src/", merge_strategy = "supplement")
# Combine all I/O from both sources
workflow <- put_merge("./src/", merge_strategy = "union")
When to Use Each Strategy:
| Strategy | Use Case |
|---|---|
| manual_priority | You want full control, auto only adds missing files |
| supplement | Your annotations have labels but missing I/O details |
| union | You want the most complete picture possible |
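The difference between the three strategies is easiest to see on a single field of a single node. The following base-R sketch is an illustration only: `merge_io` is a hypothetical helper, not a putior function, and it models just the case of a file that already has a manual annotation.

```r
# Hypothetical field-level merge for one annotated node; not putior's implementation.
merge_io <- function(manual, auto,
                     strategy = c("manual_priority", "supplement", "union")) {
  strategy <- match.arg(strategy)
  switch(strategy,
    manual_priority = manual,                                # manual value always wins
    supplement      = if (length(manual) == 0) auto else manual,  # fill only empty fields
    union           = union(manual, auto)                    # keep everything from both
  )
}

manual_inputs  <- character(0)               # annotation has a label but no input field
auto_inputs    <- "raw.csv"
manual_outputs <- "clean.csv"
auto_outputs   <- c("clean.csv", "log.txt")

merge_io(manual_inputs, auto_inputs, "manual_priority")
merge_io(manual_inputs, auto_inputs, "supplement")
merge_io(manual_outputs, auto_outputs, "union")
```

Under these assumptions, `manual_priority` leaves the empty input field empty, `supplement` fills it with the detected `raw.csv`, and `union` reports both output files.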
Before/After Merge Example:
Manual annotations only (sparse):
flowchart TD
extract(["Extract Data"])
load[["Load to DB"]]
%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class extract inputStyle
classDef outputStyle fill:#dcfce7,stroke:#16a34a,stroke-width:2px,color:#15803d
class load outputStyle
After merge with supplement strategy (auto-detected I/O added):
flowchart TD
extract(["Extract Data"])
transform["etl.R"]
load[["Load to DB"]]
%% Connections
extract --> transform
transform --> load
%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class extract inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class transform processStyle
classDef outputStyle fill:#dcfce7,stroke:#16a34a,stroke-width:2px,color:#15803d
class load outputStyle
Auto-Annotation Workflow
Source Files ──┬──> put() ──> Manual Annotations ─┬─> put_merge() ──> put_diagram()
│ │
└──> put_auto() ──> Auto Annotations ──┘
Typical Usage Pattern:
# 1. Start with auto-detection to understand code
auto <- put_auto("./new_project/")
put_diagram(auto)
# 2. Generate annotation templates
put_generate("./new_project/", output = "clipboard")
# Paste into files and customize
# 3. Use merged workflow for complete picture
final <- put_merge("./new_project/", merge_strategy = "supplement")
put_diagram(final, output = "file", file = "workflow.md")
Interactive Diagrams
Make your diagrams more useful with source file information and clickable nodes.
show_source_info - Display File Information
Show which source file each workflow node comes from. (API Reference)
workflow <- put("./src/", include_line_numbers = TRUE)
# Inline style - shows file name below node label
put_diagram(workflow, show_source_info = TRUE)
Output:
flowchart TD
load(["Load Data"<br/>(01_load.R)])
process["Process"<br/>(02_process.R)]
%% Connections
load --> process
%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class load inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class process processStyle
Subgraph Style:
Group nodes by source file using subgraphs:
put_diagram(workflow,
  show_source_info = TRUE,
  source_info_style = "subgraph")
Output:
flowchart TD
subgraph 01_load ["01_load.R"]
load(["Load Data"])
end
subgraph 02_process ["02_process.R"]
process["Process"]
end
%% Connections
load --> process
%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class load inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class process processStyle
enable_clicks - Clickable Nodes
Make diagram nodes clickable to open the source file directly in your editor. (API Reference)
workflow <- put("./src/", include_line_numbers = TRUE)
# Enable clicks with VS Code protocol
put_diagram(workflow, enable_clicks = TRUE)
# Use RStudio protocol
put_diagram(workflow, enable_clicks = TRUE, click_protocol = "rstudio")
# Use standard file:// protocol
put_diagram(workflow, enable_clicks = TRUE, click_protocol = "file")
Supported Protocols:
| Protocol | URL Format | Use With |
|---|---|---|
| vscode | vscode://file/path:line | VS Code, Cursor |
| rstudio | rstudio://open-file?path= | RStudio IDE |
| file | file:///path | System default |
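As a rough illustration of the URL formats in the table, the sketch below builds one link per protocol from a file path and line number. `make_click_url` is a hypothetical helper, not part of putior, and it skips details such as URL-encoding spaces or handling Windows drive letters.

```r
# Hypothetical helper that builds a link in each of the three formats.
make_click_url <- function(path, line = 1L,
                           protocol = c("vscode", "rstudio", "file")) {
  protocol <- match.arg(protocol)
  bare_path <- sub("^/", "", path)   # avoid a doubled slash after the scheme prefix
  switch(protocol,
    vscode  = sprintf("vscode://file/%s:%d", bare_path, as.integer(line)),
    rstudio = sprintf("rstudio://open-file?path=%s", path),
    file    = sprintf("file:///%s", bare_path)
  )
}

make_click_url("/home/user/src/01_load.R", 12, "vscode")
make_click_url("/home/user/src/01_load.R", 12, "rstudio")
make_click_url("/home/user/src/01_load.R", 12, "file")
```

Only the vscode format carries the line number, which is why extracting with `include_line_numbers = TRUE` matters most for that protocol.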
Combined Interactive Features:
put_diagram(workflow,
  show_source_info = TRUE,         # Show file names
  source_info_style = "inline",    # Inline display
  enable_clicks = TRUE,            # Make clickable
  click_protocol = "vscode")       # Open in VS Code
Detection Patterns
View and understand the patterns putior uses to auto-detect inputs and outputs.
get_detection_patterns() - View Patterns
# Get all R patterns
r_patterns <- get_detection_patterns("r")
names(r_patterns)
#> [1] "input" "output" "dependency"
# Get only input patterns for R
input_patterns <- get_detection_patterns("r", type = "input")
length(input_patterns)
#> [1] 58 # R has 58+ input patterns!
# View a specific pattern
input_patterns[[1]]
#> $regex
#> [1] "read\\.csv\\s*\\("
#> $func
#> [1] "read.csv"
#> $arg_position
#> [1] 1
#> $arg_name
#> [1] "file"
#> $description
#> [1] "Base R CSV reader"
Supported Languages
# All languages with annotation support
list_supported_languages()
#> [1] "r" "python" "shell" "julia" "ruby"
#> [6] "perl" "yaml" "toml" "sql" "lua"
#> [11] "haskell" "javascript" "typescript" "c" "cpp"
#> [16] "java" "go" "rust" "swift" "kotlin"
#> [21] "csharp" "php" "scala" "matlab" "latex"
# Languages with auto-detection patterns (18 languages, 900+ patterns)
list_supported_languages(detection_only = TRUE)
#> [1] "r" "python" "sql" "shell" "julia"
#> [6] "javascript" "typescript" "go" "rust" "java"
#> [11] "c" "cpp" "matlab" "ruby" "lua"
#> [16] "wgsl" "dockerfile" "makefile"
# Get comment prefix for any extension
get_comment_prefix("sql")
#> [1] "--"
get_comment_prefix("js")
#> [1] "//"
get_comment_prefix("m")
#> [1] "%"
Pattern Categories
R Patterns Include:
| Category | Examples |
|---|---|
| Base R | read.csv, write.csv, saveRDS, load |
| tidyverse | read_csv, write_csv, read_rds |
| data.table | fread, fwrite |
| Excel | read_excel, read_xlsx, write_xlsx |
| JSON | fromJSON, toJSON, read_json |
| Parquet/Arrow | read_parquet, write_parquet |
| Database | dbConnect, dbReadTable, dbWriteTable |
| Graphics | ggsave, pdf, png, jpeg |
| Statistical | read_sav, read_sas, read_dta |
Python Patterns Include:
| Category | Examples |
|---|---|
| pandas | pd.read_csv, .to_csv, .to_parquet |
| Built-in | open(), json.load, pickle.load |
| numpy | np.load, np.save, np.savetxt |
| matplotlib | plt.savefig |
| polars | pl.read_csv, .write_csv |
| Database | create_engine, cursor.execute |
JavaScript/TypeScript Patterns Include:
| Category | Examples |
|---|---|
| Node.js fs | fs.readFile, fs.writeFile, fs.createReadStream |
| HTTP | fetch, axios.get, got |
| Database | mongoose.connect, knex, prisma |
| Express.js | req.body, res.json, res.sendFile |
| Modules | require, import, export |
Go Patterns Include:
| Category | Examples |
|---|---|
| os/io | os.Open, os.ReadFile, os.Create |
| bufio | bufio.NewReader, bufio.NewScanner |
| encoding | json.NewDecoder, csv.NewReader |
| database/sql | sql.Open, db.Query, db.Exec |
| gorm | gorm.Open, db.Find, db.Create |
Java Patterns Include:
| Category | Examples |
|---|---|
| Classic I/O | FileInputStream, BufferedReader |
| NIO | Files.readAllLines, Files.write |
| JDBC | DriverManager.getConnection, executeQuery |
| Jackson | objectMapper.readValue, objectMapper.writeValue |
| Spring Boot | @RequestBody, ResponseEntity, repository.save |
Rust Patterns Include:
| Category | Examples |
|---|---|
| std::fs | File::open, fs::read_to_string, fs::write |
| serde | serde_json::from_reader, serde_json::to_writer |
| csv | csv::Reader::from_path, csv::Writer::from_path |
| sqlx | sqlx::connect, sqlx::query |
| reqwest | reqwest::get, Client::new |
Interactive Sandbox
The sandbox is a Shiny app for experimenting with PUT annotations without creating files.
Sandbox Features
- Code Editor: Paste or type annotated code with syntax highlighting (requires shinyAce)
- Multi-file Simulation: Use special markers to simulate multiple files:
# ===== File: 01_load.R =====
# put label:"Load Data", node_type:"input", output:"data.csv"
data <- read.csv("source.csv")
# ===== File: 02_process.R =====
# put label:"Process", input:"data.csv", output:"results.csv"
# Processing code here
- Real-time Preview: See diagram updates as you edit
- Customization Options:
  - Theme selection (github, light, dark, etc.)
  - Direction (TD, LR, BT, RL)
  - Show/hide artifacts
  - Show/hide file names
  - Workflow boundaries toggle
- Export Options:
  - Download as Markdown file
  - Copy Mermaid code to clipboard
  - View extracted workflow data
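For readers curious how the multi-file markers can be interpreted, here is a small base-R sketch that splits sandbox text into one chunk per simulated file. The function `split_sandbox_files` is hypothetical and only assumes the `# ===== File: name =====` marker format shown above; the sandbox's real parsing may differ.

```r
# Hypothetical parser for "# ===== File: name =====" markers.
split_sandbox_files <- function(text) {
  lines <- strsplit(text, "\n", fixed = TRUE)[[1]]
  marker <- "^# ===== File: (.+) =====$"
  is_marker <- grepl(marker, lines)
  file_names <- sub(marker, "\\1", lines[is_marker])
  group <- cumsum(is_marker)             # index of the file each line belongs to
  keep <- !is_marker & group > 0         # drop markers and text before the first marker
  chunks <- split(lines[keep], factor(group[keep], levels = seq_along(file_names)))
  names(chunks) <- file_names
  chunks
}

sandbox_text <- paste(
  "# ===== File: 01_load.R =====",
  '# put label:"Load Data", node_type:"input", output:"data.csv"',
  'data <- read.csv("source.csv")',
  "# ===== File: 02_process.R =====",
  '# put label:"Process", input:"data.csv", output:"results.csv"',
  sep = "\n"
)
files <- split_sandbox_files(sandbox_text)
names(files)
```

Each element of `files` holds the lines of one simulated file, so annotations can be attributed to a file name even though nothing is written to disk.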
Debugging with Logging
putior includes optional structured logging via the logger package.
Enable Logging
# Install logger if needed
install.packages("logger")
# Set log level
set_putior_log_level("DEBUG")
# Now all putior functions will log detailed information
workflow <- put("./src/")
Log Levels
| Level | What You See |
|---|---|
| DEBUG | Every operation: file scans, pattern matches, parsing steps |
| INFO | Progress milestones: scan started, nodes found, diagram complete |
| WARN | Issues that don’t stop execution: validation warnings |
| ERROR | Fatal issues only |
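The table reads most naturally as a threshold: setting a level shows messages at that level and anything more severe. The sketch below encodes that reading in base R. The numeric ordering and the helper `is_shown` are assumptions based on common logging conventions, not taken from putior or the logger package.

```r
# Assumed severity ordering, lowest to highest.
log_levels <- c(DEBUG = 1L, INFO = 2L, WARN = 3L, ERROR = 4L)

# A message is shown when its level is at or above the configured threshold.
is_shown <- function(message_level, threshold) {
  log_levels[[message_level]] >= log_levels[[threshold]]
}

# Which message levels would appear with the threshold set to "WARN"?
visible <- vapply(names(log_levels), is_shown, logical(1), threshold = "WARN")
shown_at_warn <- names(log_levels)[visible]
shown_at_warn
```

Under this assumption, lowering the threshold to "DEBUG" would make all four levels visible, which matches the "Every operation" description in the table.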
Per-Call Override
# Override for a single call without changing global setting
workflow <- put("./src/", log_level = "DEBUG")
put_diagram(workflow, log_level = "INFO")
Debugging Scenarios
Why isn’t my annotation found?
set_putior_log_level("DEBUG")
workflow <- put("./problem_file.R", include_line_numbers = TRUE)
# Check logs for pattern matching details
Why are nodes not connected?
set_putior_log_level("INFO")
put_diagram(workflow, show_artifacts = TRUE)
# Logs show connection logic
Themes and Customization
Available Themes
get_diagram_themes()
#> $light
#> [1] "Default light theme with bright colors - perfect for documentation sites"
#>
#> $dark
#> [1] "Dark theme with muted colors - ideal for dark mode environments and terminals"
#> ...
Standard Themes
| Theme | Best For | Colors |
|---|---|---|
| github | GitHub README | Light backgrounds, pastel nodes |
| light | Light documentation | Bright, vibrant colors |
| dark | Dark mode apps | Muted colors on dark |
| auto | Adaptive sites | Works in both modes |
| minimal | Reports, printing | Professional grayscale |
Colorblind-Safe Themes (Viridis Family)
These themes are perceptually uniform and tested for accessibility with color vision deficiencies (deuteranopia, protanopia, tritanopia).
| Theme | Best For | Palette |
|---|---|---|
| viridis | General use | Purple -> Blue -> Green -> Yellow |
| magma | Print, high contrast | Purple -> Red -> Yellow |
| plasma | Presentations | Purple -> Pink -> Orange -> Yellow |
| cividis | Maximum accessibility | Blue -> Gray -> Yellow (red-green safe) |
# Use colorblind-safe theme
workflow <- put("./src/")
put_diagram(workflow, theme = "viridis")
# Cividis is optimized for red-green color blindness
put_diagram(workflow, theme = "cividis")
Custom Palettes with put_theme()
Create your own color palette by overriding specific node types from any base theme:
# Create a custom palette
my_palette <- put_theme(
  base = "dark",
  input = c(fill = "#1a5276", stroke = "#2e86c1", color = "#ffffff"),
  process = c(fill = "#1e8449", stroke = "#27ae60", color = "#ffffff"),
  output = c(fill = "#922b21", stroke = "#e74c3c", color = "#ffffff")
)
# Apply with the palette parameter (overrides theme)
workflow <- put("./src/")
put_diagram(workflow, palette = my_palette)
Only specify the node types you want to change; the rest inherit from the base theme.
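The inheritance rule can be pictured as a list merge: start from the base theme and replace only the entries that were overridden. The base-R sketch below uses `modifyList()` on a plain list to show the idea. The list layout is an assumption for illustration and is not the object that `put_theme()` returns.

```r
# Hypothetical base theme: one named style vector per node type.
base_theme <- list(
  input   = c(fill = "#dbeafe", stroke = "#2563eb", color = "#1e40af"),
  process = c(fill = "#ede9fe", stroke = "#7c3aed", color = "#5b21b6"),
  output  = c(fill = "#dcfce7", stroke = "#16a34a", color = "#15803d")
)

# Override a single node type; the others are left untouched.
overrides <- list(
  process = c(fill = "#1e8449", stroke = "#27ae60", color = "#ffffff")
)

custom <- modifyList(base_theme, overrides)
custom$process[["fill"]]
custom$input[["fill"]]
```

Here `process` takes the new colors while `input` and `output` keep the values from the base list, which mirrors the behavior described for partial overrides.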
Theme Examples
workflow <- put("./src/")
# GitHub (recommended for README)
put_diagram(workflow, theme = "github")
# Dark mode
put_diagram(workflow, theme = "dark", direction = "LR")
# Minimal for reports
put_diagram(workflow, theme = "minimal", output = "file", file = "report.md")
Styling Options
put_diagram(workflow,
  theme = "github",                  # Color theme
  direction = "TD",                  # Flow direction
  style_nodes = TRUE,                # Apply colors
  show_workflow_boundaries = TRUE,   # Special start/end styling
  node_labels = "label"              # Label style: "name", "label", "both"
)
Direction Options
| Direction | Description | Best For |
|---|---|---|
| TD | Top to bottom | Deep pipelines |
| LR | Left to right | Wide workflows |
| BT | Bottom to top | Unusual layouts |
| RL | Right to left | RTL languages |
When Things Go Wrong with Diagrams:
Problem: Diagram doesn’t render (shows raw text)
- Test your Mermaid code at mermaid.live to identify syntax issues
- For pkgdown sites, ensure Mermaid.js is included in _pkgdown.yml
- Use output = "raw" and render manually with Mermaid CLI: mmdc -i diagram.mmd -o diagram.svg
Problem: Nodes overlap or layout looks wrong
- Try different directions: direction = "LR" often works better for wide workflows
- Split large workflows into subgraphs: show_source_info = TRUE, source_info_style = "subgraph"
- Use explicit IDs to control node ordering (Mermaid renders in ID order)
Problem: Too many nodes, diagram is unreadable
- Hide data file nodes: show_artifacts = FALSE
- Filter workflow before rendering: workflow[workflow$file_name != "test.R", ]
- Split into multiple diagrams by directory or stage
Problem: Need to manually edit the Mermaid output
- Use output = "raw" to get editable Mermaid code
- Save to file: put_diagram(workflow, output = "file", file = "workflow.mmd")
- Edit the .mmd file and render with your preferred tool
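Filtering before rendering is ordinary data frame subsetting. The sketch below uses a small hand-built data frame in place of a real `put()` result; it assumes only the `file_name` and `node_type` columns referenced in this guide, and the rows are made up for illustration.

```r
# Toy stand-in for a workflow data frame (real results have more columns).
workflow <- data.frame(
  file_name = c("01_load.R", "02_process.R", "test_process.R"),
  node_type = c("input", "process", "process"),
  stringsAsFactors = FALSE
)

# Drop every node that comes from a file whose name starts with "test_"
filtered <- workflow[!grepl("^test_", workflow$file_name), ]
filtered$file_name
```

A regex filter like this scales better than comparing against a single file name when several test or fixture files should be hidden.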
Performance
putior is designed to handle codebases of all sizes efficiently. Understanding its performance characteristics helps you optimize for large projects.
Time Complexity
Annotation parsing operates in O(n) time where n is the total number of lines across all scanned files:
| Operation | Complexity | Notes |
|---|---|---|
| File scanning | O(files) | Directory traversal |
| Line parsing | O(lines) | Single pass per file |
| Pattern matching | O(lines x patterns) | Regex matching per line |
| Diagram generation | O(nodes + edges) | Graph construction |
The dominant factor is total lines scanned. For most codebases, parsing completes in milliseconds to seconds.
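A quick back-of-the-envelope calculation shows what the O(lines x patterns) row means in practice. The sketch below treats the cost as one regex evaluation per pattern per line, which is a simplifying assumption and an upper bound; the figures are illustrative inputs (a 50,000-line codebase and the 58 R input patterns mentioned in this guide), not measurements.

```r
# Rough upper bound on regex evaluations: every pattern tried on every line.
estimate_regex_checks <- function(n_lines, n_patterns) {
  n_lines * n_patterns
}

checks <- estimate_regex_checks(n_lines = 50000, n_patterns = 58)
format(checks, big.mark = ",")
```

Narrowing the set of scanned files reduces the first factor directly, which is why restricting the file pattern is usually the cheapest optimization.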
Memory Usage
putior processes files sequentially and stores only:
- Extracted annotations (typically small)
- Node and edge data for diagram generation
- File metadata when include_line_numbers = TRUE
Memory usage scales with the number of annotations found, not the total codebase size. A 100,000-line codebase with 50 annotations uses similar memory to a 1,000-line codebase with 50 annotations.
Performance Benchmarks
Expected processing times on typical hardware (results may vary):
| Codebase Size | Files | Lines | Approximate Time |
|---|---|---|---|
| Small project | 10-50 | 1,000-5,000 | < 100ms |
| Medium project | 50-200 | 5,000-50,000 | 100-500ms |
| Large project | 200-1,000 | 50,000-500,000 | 0.5-3s |
| Monorepo | 1,000+ | 500,000+ | 3-10s |
Auto-detection (put_auto()) is slower than manual annotation extraction (put()) due to additional pattern matching.
Tips for Large Codebases
1. Use specific file patterns instead of scanning everything:
# Slow: scans all files recursively
workflow <- put("./src/")
# Faster: only scan R files
workflow <- put("./src/", pattern = "\\.R$")
# Fastest: scan specific directories
workflow <- put(c("./src/etl/", "./src/analysis/"))
2. Exclude files with regex patterns:
# Skip test files
workflow <- put("./src/", exclude = "test")
# Skip multiple patterns
workflow <- put_auto("./project/", exclude = c("test", "vendor", "\\.min\\.js$"))
# All four scan functions support exclude
put_generate("./src/", exclude = "fixture")
put_merge("./src/", exclude = c("mock", "snapshot"))
3. Disable validation for performance-critical scripts:
# Skip validation checks for faster processing
workflow <- put("./src/", validate = FALSE)
4. Use recursive = FALSE to limit scope when appropriate:
# Only scan top-level directory (recursive is TRUE by default)
workflow <- put("./src/", recursive = FALSE)
# Or scan a specific subdirectory
workflow2 <- put("./src/important_module/")
5. Consider splitting large directories:
# Process in chunks for very large projects
etl_workflow <- put("./src/etl/")
analysis_workflow <- put("./src/analysis/")
reporting_workflow <- put("./src/reporting/")
# Combine if needed
# all_workflows <- rbind(etl_workflow, analysis_workflow, reporting_workflow)
6. Cache results for repeated use:
# Save workflow for reuse
workflow <- put("./src/")
saveRDS(workflow, "workflow_cache.rds")
# Load cached workflow (instant)
workflow <- readRDS("workflow_cache.rds")
put_diagram(workflow)
7. Profile before optimizing:
# Measure actual time
system.time({
  workflow <- put("./src/", recursive = TRUE)
})
# For detailed profiling
if (requireNamespace("profvis", quietly = TRUE)) {
  profvis::profvis({
    workflow <- put("./src/", recursive = TRUE)
  })
}
Putting It All Together
Complete Interactive Documentation Workflow
library(putior)
# 1. Enable logging for debugging
set_putior_log_level("INFO")
# 2. Extract with line numbers for clickable links
workflow <- put("./src/",
  recursive = TRUE,
  include_line_numbers = TRUE)
# 3. Merge with auto-detection for completeness
complete_workflow <- put_merge("./src/",
  recursive = TRUE,
  merge_strategy = "supplement")
# 4. Generate interactive diagram
put_diagram(complete_workflow,
  theme = "github",
  direction = "TD",
  show_artifacts = TRUE,
  show_source_info = TRUE,
  source_info_style = "subgraph",
  enable_clicks = TRUE,
  click_protocol = "vscode",
  title = "Data Pipeline",
  output = "file",
  file = "docs/workflow.md"
)
# 5. Return to normal logging
set_putior_log_level("WARN")
Quick Visualization of Unknown Code
# Instantly understand a new codebase
workflow <- put_auto("./unfamiliar_project/", recursive = TRUE)
put_diagram(workflow, show_artifacts = TRUE)
# Generate annotation suggestions
put_generate("./unfamiliar_project/", output = "clipboard")
See Also
- Quick Start - First diagram in 2 minutes
- Annotation Guide - Complete syntax reference
- Quick Reference - Cheat sheet for daily use
- API Reference - Complete function documentation
- Showcase - Real-world examples
Try the Examples
# Auto-annotation example
source(system.file("examples", "auto-annotation-example.R", package = "putior"))
# Interactive diagrams example
source(system.file("examples", "interactive-diagrams-example.R", package = "putior"))
# Variable reference example
source(system.file("examples", "variable-reference-example.R", package = "putior"))
# Launch sandbox
run_sandbox()