This showcase demonstrates putior diagrams at different scales, from simple workflows to complex multi-file pipelines.
Small Workflows (3-5 nodes)
Perfect for single-purpose scripts or focused analysis tasks.
Example: Simple ETL Pipeline
A basic extract-transform-load workflow:
# 01_extract.R
# put label:"Extract Data", node_type:"input", output:"raw_data.csv"
# 02_transform.R
# put label:"Transform Data", input:"raw_data.csv", output:"clean_data.csv"
# 03_load.R
# put label:"Load to Database", node_type:"output", input:"clean_data.csv"Generated Diagram:
flowchart TD
extract(["Extract Data"])
transform["Transform Data"]
load[["Load to Database"]]
%% Connections
extract --> transform
transform --> load
%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class extract inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class transform processStyle
classDef outputStyle fill:#dcfce7,stroke:#16a34a,stroke-width:2px,color:#15803d
class load outputStyle
Example: Report Generation
A simple report generation workflow:
# fetch_metrics.R
# put label:"Fetch Metrics", node_type:"input", output:"metrics.json"
# analyze.R
# put label:"Analyze Trends", input:"metrics.json", output:"analysis.rds"
# report.R
# put label:"Generate Report", node_type:"output", input:"analysis.rds", output:"report.html"Generated Diagram:
flowchart TD
fetch(["Fetch Metrics"])
analyze["Analyze Trends"]
report[["Generate Report"]]
%% Connections
fetch --> analyze
analyze --> report
%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class fetch inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class analyze processStyle
classDef outputStyle fill:#dcfce7,stroke:#16a34a,stroke-width:2px,color:#15803d
class report outputStyle
Medium Workflows (10-15 nodes)
Suitable for typical data science projects with multiple processing stages.
Example: Machine Learning Pipeline
A complete ML workflow from data collection to model deployment:
# 01_collect_data.py
# put label:"Collect Raw Data", node_type:"input", output:"raw_data.csv"
# 02_clean_data.R
# put label:"Clean Data", input:"raw_data.csv", output:"clean_data.csv"
# 03_feature_eng.R
# put label:"Feature Engineering", input:"clean_data.csv", output:"features.csv"
# 04_split_data.R
# put label:"Train/Test Split", input:"features.csv", output:"train.csv, test.csv"
# 05_train_model.py
# put label:"Train Model", input:"train.csv", output:"model.pkl"
# 06_evaluate.py
# put label:"Evaluate Model", input:"model.pkl, test.csv", output:"metrics.json"
# 07_hyperparameter.py
# put label:"Hyperparameter Tuning", input:"train.csv", output:"best_params.json"
# 08_retrain.py
# put label:"Retrain with Best Params", input:"train.csv, best_params.json", output:"final_model.pkl"
# 09_validate.R
# put label:"Final Validation", input:"final_model.pkl, test.csv", output:"validation_report.html"
# 10_deploy.sh
# put label:"Deploy Model", node_type:"output", input:"final_model.pkl, validation_report.html"Generated Diagram:
flowchart TD
collect(["Collect Raw Data"])
clean["Clean Data"]
feature["Feature Engineering"]
split["Train/Test Split"]
train["Train Model"]
evaluate["Evaluate Model"]
hyper["Hyperparameter Tuning"]
retrain["Retrain with Best Params"]
validate["Final Validation"]
deploy[["Deploy Model"]]
%% Connections
collect --> clean
clean --> feature
feature --> split
split --> train
train --> evaluate
split --> evaluate
split --> hyper
split --> retrain
hyper --> retrain
retrain --> validate
split --> validate
retrain --> deploy
validate --> deploy
%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class collect inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class clean processStyle
class feature processStyle
class split processStyle
class train processStyle
class evaluate processStyle
class hyper processStyle
class retrain processStyle
class validate processStyle
classDef outputStyle fill:#dcfce7,stroke:#16a34a,stroke-width:2px,color:#15803d
class deploy outputStyle
Example: Multi-Source Data Integration
Combining data from multiple sources:
# sources/fetch_sales.R
# put label:"Fetch Sales API", node_type:"input", output:"sales_raw.json"
# sources/fetch_inventory.R
# put label:"Fetch Inventory DB", node_type:"input", output:"inventory_raw.csv"
# sources/fetch_customers.py
# put label:"Fetch Customer CRM", node_type:"input", output:"customers_raw.csv"
# transform/clean_sales.R
# put label:"Clean Sales", input:"sales_raw.json", output:"sales_clean.csv"
# transform/clean_inventory.R
# put label:"Clean Inventory", input:"inventory_raw.csv", output:"inventory_clean.csv"
# transform/clean_customers.R
# put label:"Clean Customers", input:"customers_raw.csv", output:"customers_clean.csv"
# integrate/merge_data.R
# put label:"Merge All Sources", input:"sales_clean.csv, inventory_clean.csv, customers_clean.csv", output:"integrated_data.csv"
# analyze/business_metrics.R
# put label:"Calculate Metrics", input:"integrated_data.csv", output:"metrics.rds"
# report/dashboard.R
# put label:"Generate Dashboard", node_type:"output", input:"metrics.rds", output:"dashboard.html"Generated Diagram:
flowchart TD
sales_api(["Fetch Sales API"])
inv_db(["Fetch Inventory DB"])
cust_crm(["Fetch Customer CRM"])
clean_sales["Clean Sales"]
clean_inv["Clean Inventory"]
clean_cust["Clean Customers"]
merge["Merge All Sources"]
metrics["Calculate Metrics"]
dashboard[["Generate Dashboard"]]
%% Connections
sales_api --> clean_sales
inv_db --> clean_inv
cust_crm --> clean_cust
clean_sales --> merge
clean_inv --> merge
clean_cust --> merge
merge --> metrics
metrics --> dashboard
%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class sales_api inputStyle
class inv_db inputStyle
class cust_crm inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class clean_sales processStyle
class clean_inv processStyle
class clean_cust processStyle
class merge processStyle
class metrics processStyle
classDef outputStyle fill:#dcfce7,stroke:#16a34a,stroke-width:2px,color:#15803d
class dashboard outputStyle
Large Workflows (20+ nodes)
For enterprise-scale data pipelines and complex analysis systems.
Example: Complete Analytics Platform
A full analytics platform with multiple parallel processing streams.
Note: This complex subgraph diagram uses advanced Mermaid
features (named subgraphs) that put_diagram() doesn’t
generate natively. For enterprise workflows with complex groupings, you
can combine putior-generated diagrams with hand-crafted Mermaid
extensions.
flowchart TD
subgraph Data_Sources [Data Sources]
web_logs([Web Logs])
app_events([App Events])
crm_data([CRM Data])
finance_data([Finance Data])
end
subgraph Ingestion [Ingestion Layer]
parse_logs[Parse Web Logs]
parse_events[Parse App Events]
extract_crm[Extract CRM]
extract_finance[Extract Finance]
end
subgraph Transformation [Transformation Layer]
clean_logs[Clean Logs]
clean_events[Clean Events]
clean_crm[Clean CRM]
clean_finance[Clean Finance]
enrich_logs[Enrich with Geo]
enrich_events[Add Session Info]
join_customer[Join Customer Data]
end
subgraph Analytics [Analytics Layer]
user_behavior[User Behavior Analysis]
conversion_funnel[Conversion Funnel]
revenue_analysis[Revenue Analysis]
cohort_analysis[Cohort Analysis]
ab_testing[A/B Test Results]
end
subgraph Output [Output Layer]
exec_dashboard[[Executive Dashboard]]
marketing_report[[Marketing Report]]
finance_report[[Finance Report]]
data_warehouse[[Data Warehouse]]
end
web_logs --> parse_logs
app_events --> parse_events
crm_data --> extract_crm
finance_data --> extract_finance
parse_logs --> clean_logs
parse_events --> clean_events
extract_crm --> clean_crm
extract_finance --> clean_finance
clean_logs --> enrich_logs
clean_events --> enrich_events
clean_crm --> join_customer
clean_finance --> revenue_analysis
enrich_logs --> user_behavior
enrich_events --> user_behavior
enrich_events --> conversion_funnel
join_customer --> cohort_analysis
join_customer --> ab_testing
user_behavior --> exec_dashboard
conversion_funnel --> marketing_report
revenue_analysis --> finance_report
cohort_analysis --> exec_dashboard
ab_testing --> marketing_report
user_behavior --> data_warehouse
revenue_analysis --> data_warehouse
cohort_analysis --> data_warehouse
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
classDef outputStyle fill:#dcfce7,stroke:#16a34a,stroke-width:2px,color:#15803d
class web_logs,app_events,crm_data,finance_data inputStyle
class parse_logs,parse_events,extract_crm,extract_finance processStyle
class clean_logs,clean_events,clean_crm,clean_finance processStyle
class enrich_logs,enrich_events,join_customer processStyle
class user_behavior,conversion_funnel,revenue_analysis,cohort_analysis,ab_testing processStyle
class exec_dashboard,marketing_report,finance_report,data_warehouse outputStyle
Multi-Language Workflows
putior excels at documenting polyglot data pipelines with automatic comment syntax detection for 30+ languages.
Language-Specific Comment Syntax
| Comment Style | Languages | Example |
|---|---|---|
# put |
R, Python, Shell, Julia, Ruby, YAML | # put label:"Process Data" |
-- put |
SQL, Lua, Haskell | -- put label:"Query Database" |
// put |
JavaScript, TypeScript, C, Go, Rust, Java | // put label:"Transform JSON" |
% put |
MATLAB, LaTeX | % put label:"Compute Matrix" |
Example: R + Python + SQL Pipeline
Each language uses its native comment syntax:
-- extract.sql (SQL uses -- comments)
-- put label:"SQL Extract", node_type:"input", output:"raw_query_results.csv"
SELECT * FROM sales WHERE date > '2024-01-01';# transform.py (Python uses # comments)
# put label:"Python Transform", input:"raw_query_results.csv", output:"transformed.parquet"
import pandas as pd
df = pd.read_csv("raw_query_results.csv")
# analyze.R (R uses # comments)
# put label:"R Statistical Analysis", input:"transformed.parquet", output:"stats.rds"
library(arrow)
data <- read_parquet("transformed.parquet")
# visualize.R
# put label:"R Visualization", input:"stats.rds", output:"plots.pdf"
# report.py
# put label:"Python Report Gen", node_type:"output", input:"stats.rds, plots.pdf", output:"final_report.html"Generated Diagram:
flowchart TD
sql(["SQL Extract"])
python_transform["Python Transform"]
r_stats["R Statistical Analysis"]
r_viz["R Visualization"]
python_report[["Python Report Gen"]]
%% Connections
sql --> python_transform
python_transform --> r_stats
r_stats --> r_viz
r_stats --> python_report
r_viz --> python_report
%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class sql inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class python_transform processStyle
class r_stats processStyle
class r_viz processStyle
classDef outputStyle fill:#dcfce7,stroke:#16a34a,stroke-width:2px,color:#15803d
class python_report outputStyle
Domain-Specific Examples
Real-world workflows from various data science domains.
Bioinformatics Pipeline
A genomics analysis workflow processing FASTA sequences:
# sequences/fetch_sequences.R
# put label:"Fetch FASTA Sequences", node_type:"input", output:"raw_sequences.fasta"
# sequences/quality_control.py
# put label:"Quality Control", input:"raw_sequences.fasta", output:"filtered_sequences.fasta, qc_report.html"
# alignment/run_blast.sh
# put label:"BLAST Alignment", input:"filtered_sequences.fasta", output:"blast_results.xml"
# alignment/parse_blast.R
# put label:"Parse BLAST Results", input:"blast_results.xml", output:"alignments.csv"
# analysis/differential_expression.R
# put label:"Differential Expression", input:"alignments.csv", output:"de_results.rds"
# analysis/pathway_analysis.R
# put label:"Pathway Enrichment", input:"de_results.rds", output:"pathways.csv"
# report/bioinformatics_report.R
# put label:"Generate Report", node_type:"output", input:"de_results.rds, pathways.csv, qc_report.html", output:"analysis_report.html"Generated Diagram:
flowchart TD
fetch(["Fetch FASTA Sequences"])
qc["Quality Control"]
blast["BLAST Alignment"]
parse["Parse BLAST Results"]
de["Differential Expression"]
pathway["Pathway Enrichment"]
report[["Generate Report"]]
%% Connections
fetch --> qc
qc --> blast
blast --> parse
parse --> de
de --> pathway
de --> report
pathway --> report
qc --> report
%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class fetch inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class qc processStyle
class blast processStyle
class parse processStyle
class de processStyle
class pathway processStyle
classDef outputStyle fill:#dcfce7,stroke:#16a34a,stroke-width:2px,color:#15803d
class report outputStyle
Financial Analysis Pipeline
Portfolio analysis and risk assessment workflow:
# data/fetch_market_data.py
# put label:"Fetch Market Data", node_type:"input", output:"market_prices.parquet"
# data/fetch_holdings.R
# put label:"Load Portfolio Holdings", node_type:"input", output:"holdings.csv"
# analysis/calculate_returns.R
# put label:"Calculate Returns", input:"market_prices.parquet, holdings.csv", output:"returns.rds"
# analysis/risk_metrics.R
# put label:"Compute Risk Metrics", input:"returns.rds", output:"var_results.rds, sharpe_ratios.csv"
# analysis/attribution.py
# put label:"Performance Attribution", input:"returns.rds, holdings.csv", output:"attribution.json"
# optimization/portfolio_optimize.R
# put label:"Portfolio Optimization", input:"returns.rds, var_results.rds", output:"optimal_weights.csv"
# report/risk_dashboard.R
# put label:"Risk Dashboard", node_type:"output", input:"var_results.rds, sharpe_ratios.csv, attribution.json, optimal_weights.csv", output:"risk_report.html"Generated Diagram:
flowchart TD
market(["Fetch Market Data"])
holdings(["Load Portfolio Holdings"])
returns["Calculate Returns"]
risk["Compute Risk Metrics"]
attrib["Performance Attribution"]
optimize["Portfolio Optimization"]
dashboard[["Risk Dashboard"]]
%% Connections
market --> returns
holdings --> returns
returns --> risk
returns --> attrib
holdings --> attrib
returns --> optimize
risk --> optimize
risk --> dashboard
attrib --> dashboard
optimize --> dashboard
%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class market inputStyle
class holdings inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class returns processStyle
class risk processStyle
class attrib processStyle
class optimize processStyle
classDef outputStyle fill:#dcfce7,stroke:#16a34a,stroke-width:2px,color:#15803d
class dashboard outputStyle
Web Scraping Pipeline
Data extraction from web sources:
# scrape/fetch_urls.py
# put label:"Fetch URL List", node_type:"input", output:"target_urls.txt"
# scrape/scrape_pages.py
# put label:"Scrape Web Pages", input:"target_urls.txt", output:"raw_html.json"
# extract/parse_html.py
# put label:"Parse HTML Content", input:"raw_html.json", output:"extracted_text.json"
# extract/extract_entities.py
# put label:"Named Entity Recognition", input:"extracted_text.json", output:"entities.csv"
# transform/clean_data.R
# put label:"Clean and Normalize", input:"entities.csv", output:"clean_entities.csv"
# transform/deduplicate.R
# put label:"Remove Duplicates", input:"clean_entities.csv", output:"unique_entities.csv"
# load/save_to_db.py
# put label:"Load to Database", node_type:"output", input:"unique_entities.csv"Generated Diagram:
flowchart TD
urls(["Fetch URL List"])
scrape["Scrape Web Pages"]
parse["Parse HTML Content"]
ner["Named Entity Recognition"]
clean["Clean and Normalize"]
dedup["Remove Duplicates"]
db[["Load to Database"]]
%% Connections
urls --> scrape
scrape --> parse
parse --> ner
ner --> clean
clean --> dedup
dedup --> db
%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class urls inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class scrape processStyle
class parse processStyle
class ner processStyle
class clean processStyle
class dedup processStyle
classDef outputStyle fill:#dcfce7,stroke:#16a34a,stroke-width:2px,color:#15803d
class db outputStyle
Multi-Language ML Pipeline
A realistic ML workflow using R for data prep, Python for training, and R for reporting:
# data/load_raw_data.R
# put label:"Load Raw Data (R)", node_type:"input", output:"raw_data.rds"
# data/eda_analysis.R
# put label:"Exploratory Analysis (R)", input:"raw_data.rds", output:"eda_report.html, data_summary.json"
# preprocessing/feature_engineering.R
# put label:"Feature Engineering (R)", input:"raw_data.rds, data_summary.json", output:"features.parquet"
# preprocessing/split_data.R
# put label:"Train/Test Split (R)", input:"features.parquet", output:"train.parquet, test.parquet"
# training/train_model.py
# put label:"Train XGBoost (Python)", input:"train.parquet", output:"model.pkl, training_metrics.json"
# training/hyperparameter_search.py
# put label:"Hyperparameter Tuning (Python)", input:"train.parquet", output:"best_params.json"
# training/final_model.py
# put label:"Final Model Training (Python)", input:"train.parquet, best_params.json", output:"final_model.pkl"
# evaluation/model_evaluation.py
# put label:"Model Evaluation (Python)", input:"final_model.pkl, test.parquet", output:"predictions.csv, eval_metrics.json"
# reporting/model_report.R
# put label:"Model Report (R)", node_type:"output", input:"eval_metrics.json, training_metrics.json, eda_report.html", output:"final_report.html"
# deployment/export_model.py
# put label:"Export for Production (Python)", node_type:"output", input:"final_model.pkl", output:"model_artifact.tar.gz"Generated Diagram:
flowchart TD
load(["Load Raw Data - R"])
eda["Exploratory Analysis - R"]
features["Feature Engineering - R"]
split["Train/Test Split - R"]
train["Train XGBoost - Python"]
hyper["Hyperparameter Tuning - Python"]
final["Final Model Training - Python"]
evaluate["Model Evaluation - Python"]
report[["Model Report - R"]]
deploy_export[["Export for Production - Python"]]
%% Connections
load --> eda
load --> features
eda --> features
features --> split
split --> train
split --> hyper
split --> final
hyper --> final
final --> evaluate
split --> evaluate
evaluate --> report
train --> report
eda --> report
final --> deploy_export
%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class load inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class eda processStyle
class features processStyle
class split processStyle
class train processStyle
class hyper processStyle
class final processStyle
class evaluate processStyle
classDef outputStyle fill:#dcfce7,stroke:#16a34a,stroke-width:2px,color:#15803d
class report outputStyle
class deploy_export outputStyle
This example demonstrates: - R for data handling: Loading, EDA, feature engineering, splitting - Python for ML: XGBoost training, hyperparameter search, evaluation - R for reporting: Combining results into a final report - Python for deployment: Packaging model artifacts
Improving Existing Annotations
Real-world codebases often have messy, incomplete annotations. Here’s how to clean them up.
Before: A Messy Starting Point
This ETL script has common problems:
# etl_pipeline.R - typical messy annotations
# put id:"step1", output:"data"
# ^ Problem: Vague ID and output name
raw <- read.csv("sales_2024.csv")
# put id:"2", input:"data"
# ^ Problem: Inconsistent ID style (numeric), output missing
clean <- raw[complete.cases(raw), ]
clean$date <- as.Date(clean$date)
# (No annotation here - important step is undocumented!)
aggregated <- aggregate(amount ~ region, clean, sum)
# put label:"final step"
# ^ Problem: Missing ID, vague label, no input/output
write.csv(aggregated, "regional_sales.csv")Resulting diagram (disconnected, unclear):
flowchart TD
step1["step1"]
node_2["2"]
final_step_1["final step"]
%% Connections
step1 --> node_2
%% Styling
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class step1 processStyle
class node_2 processStyle
class final_step_1 processStyle
Step-by-Step Improvement
Step 1: Audit current state
workflow <- put("etl_pipeline.R", validate = TRUE)
print(workflow) # See what's detected
# Validation warnings will highlight issuesStep 2: Use auto-detection to find gaps
Step 3: Generate annotation templates
put_generate("etl_pipeline.R")
# Outputs suggested annotations based on code patternsStep 4: Apply fixes with naming conventions
| Convention | Example | Benefit |
|---|---|---|
| Descriptive IDs |
extract_sales not step1
|
Self-documenting |
| Verb + noun labels | “Load Sales Data” | Clear action |
| Full file names |
sales_2024.csv not data
|
Traceable |
| Consistent style |
snake_case IDs |
Maintainable |
After: Clean Annotations
# etl_pipeline.R - improved annotations
# put id:"extract_sales", label:"Load Sales Data", \
# node_type:"input", output:"sales_2024.csv"
raw <- read.csv("sales_2024.csv")
# put id:"clean_data", label:"Clean & Validate", \
# input:"sales_2024.csv", output:"clean_sales.internal"
clean <- raw[complete.cases(raw), ]
clean$date <- as.Date(clean$date)
# put id:"aggregate_regions", label:"Aggregate by Region", \
# input:"clean_sales.internal", output:"aggregated.internal"
aggregated <- aggregate(amount ~ region, clean, sum)
# put id:"export_results", label:"Export Regional Report", \
# node_type:"output", input:"aggregated.internal", output:"regional_sales.csv"
write.csv(aggregated, "regional_sales.csv")Resulting diagram (connected, clear flow):
flowchart TD
extract_sales(["Load Sales Data"])
clean_data["Clean & Validate"]
aggregate_regions["Aggregate by Region"]
export_results[["Export Regional Report"]]
%% Connections
extract_sales --> clean_data
clean_data --> aggregate_regions
aggregate_regions --> export_results
%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class extract_sales inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class clean_data processStyle
class aggregate_regions processStyle
classDef outputStyle fill:#dcfce7,stroke:#16a34a,stroke-width:2px,color:#15803d
class export_results outputStyle
Key Improvements Made
| Before | After | Why |
|---|---|---|
id:"step1" |
id:"extract_sales" |
Descriptive, searchable |
output:"data" |
output:"sales_2024.csv" |
Actual file name |
| Missing annotation | Added for aggregation step | Complete workflow |
label:"final step" |
label:"Export Regional Report" |
Specific action |
| No node_type | Explicit input/process/output | Proper diagram shapes |
Workflow for Legacy Code
For existing codebases without any annotations:
# 1. Start with auto-detection
auto_workflow <- put_auto("./legacy_code/", recursive = TRUE)
put_diagram(auto_workflow) # Get initial picture
# 2. Generate annotation suggestions
put_generate("./legacy_code/", output = "clipboard")
# Paste into files and customize
# 3. Add manual annotations for key files
# Focus on main entry points first
# 4. Merge for complete picture
final <- put_merge("./legacy_code/",
merge_strategy = "supplement",
recursive = TRUE)
put_diagram(final, show_source_info = TRUE)Tips for Large Workflows
When working with complex workflows:
- Use meaningful IDs: Choose IDs that reflect the step’s purpose
- Group related files: Organize scripts into subdirectories
-
Use subgraphs: Group related nodes with
show_source_info = TRUE, source_info_style = "subgraph" -
Consider direction: Use
direction = "LR"for wide workflows,direction = "TD"for deep ones -
Show artifacts selectively: Use
show_artifacts = TRUEonly when data lineage matters
# For large workflows, consider:
put_diagram(workflow,
direction = "LR", # Left-to-right for wide pipelines
show_source_info = TRUE, # Show file names
source_info_style = "subgraph",# Group by file
theme = "minimal" # Clean look for complex diagrams
)Try It Yourself
Run the built-in examples:
# Basic example
source(system.file("examples", "reprex.R", package = "putior"))
# Data science workflow
source(system.file("examples", "data-science-workflow.R", package = "putior"))
# Self-documentation (putior documents itself!)
source(system.file("examples", "self-documentation.R", package = "putior"))See Also
Functions used in these examples:
| Function | Purpose | Documentation |
|---|---|---|
put() |
Extract annotations | API Reference |
put_diagram() |
Generate diagrams | API Reference |
put_auto() |
Auto-detect workflows | Features Tour |
put_merge() |
Combine manual + auto | Features Tour |
Related guides:
- Quick Start - Create your first diagram
- Quick Reference - Cheat sheet for daily use
- Annotation Guide - Complete syntax reference
- Features Tour - Interactive diagrams, themes, logging
- Troubleshooting - Common issues and solutions