Graphviz Diagram Creation: DOT Language Syntax and Practical Examples

Tech May 9 3

Graphviz Overview

Graphviz is an automated graph layout engine capable of generating various output formats including PNG, PDF, and SVG. The tool uses the DOT language to define graphs, nodes, and edges with customizable properties.

Table of Contents

  • DOT language fundamentals
  • Common graphical attributes
  • Practical examples
  • VSCode preview integration
  • Additional resources

DOT Language Structure

Graphviz builds diagrams from three core components: graphs, nodes, and edges. Each element accepts properties that control its visual representation.

The following BNF-style notation describes the DOT language grammar:

    Declaration : Structure

    graph       : [ strict ] (graph | digraph) [ ID ] '{' stmt_list '}'
    stmt_list   : [ stmt [ ';' ] stmt_list ]
    stmt        : node_stmt | edge_stmt | attr_stmt | ID '=' ID | subgraph
    attr_stmt   : (graph | node | edge) attr_list
    attr_list   : '[' [ a_list ] ']' [ attr_list ]
    a_list      : ID '=' ID [ (';' | ',') ] [ a_list ]
    edge_stmt   : (node_id | subgraph) edgeRHS [ attr_list ]
    edgeRHS     : edgeop (node_id | subgraph) [ edgeRHS ]
    node_stmt   : node_id [ attr_list ]
    node_id     : ID [ port ]
    port        : ':' ID [ ':' compass_pt ] | ':' compass_pt
    subgraph    : [ subgraph [ ID ] ] '{' stmt_list '}'
    compass_pt  : (n | ne | e | se | s | sw | w | nw | c | _)

An ID represents a string identifier for naming elements or properties. The naming conventions include:

  1. Any string of alphabetic characters ([a-zA-Z\200-\377]), underscores, and digits ([0-9]) that does not begin with a digit
  2. Numerals matching [-]?(.[0-9]+ | [0-9]+(.[0-9]*)?)
  3. Double-quoted strings "..."
  4. HTML strings <...>
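
As a rough illustration of the four ID forms, the sketch below uses each one at least once. The graph and node names are hypothetical and chosen only for this example.

```dot
// Hypothetical graph: every name here is made up for illustration.
digraph id_forms {
    plain_id_1                               // form 1: letters, digits, underscores
    42                                       // form 2: a numeral used as a node name
    3.14                                     // form 2: numerals may have a fractional part
    "two words"                              // form 3: quoted string, spaces allowed
    html_node [label = <<b>bold</b> text>]   // form 4: HTML string as an attribute value

    plain_id_1 -> 42 -> "two words" -> html_node
}
```

Quoting is the usual way out when a name contains spaces, punctuation, or a leading digit followed by letters.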

DOT Language Keywords

  • strict: Forbids multi-edges, so at most one edge can connect a given tail node and head node
  • graph: Declares an undirected graph
  • digraph: Declares a directed graph
  • node: Applies default attributes to all subsequent nodes
  • edge: Applies default attributes to all subsequent edges
  • subgraph: Groups elements for organization or layout control
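
The keywords above can be seen together in a small sketch like the following. The graph name, node names, and colors are assumptions made for the example.

```dot
// Hypothetical example combining the keywords listed above.
strict digraph keywords {
    node [shape = box]          // default for nodes created from here on
    edge [color = gray40]       // default for edges created from here on

    subgraph group_a {          // plain subgraph: groups nodes, draws no border
        a1 a2
    }

    a1 -> a2
    a1 -> a2 [color = red]      // same tail and head: strict keeps a single edge
}
```

Removing the strict keyword from the first line should make the layout draw two parallel edges between a1 and a2 instead of one.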

Key Structural Observations

  1. Every graph is declared with graph or digraph, and its statements are enclosed in braces
  2. The statement list can be empty or contain multiple statements; semicolons are optional
  3. Statement types include:
    • node declarations
    • edge declarations
    • subgraph declarations
    • attribute lists
    • ID assignment statements
  4. Attribute lists are delimited by square brackets and may be empty
  5. Individual attributes follow the key = value format, optionally separated by commas or semicolons
  6. Edge declarations can chain multiple nodes or subgraphs, followed by a single optional attribute list that applies to every edge in the chain
  7. Node declarations support port-based connections for complex layouts
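
A minimal sketch of these statement types in one undirected graph follows. All identifiers are hypothetical; the comments point back to the grammar rules they exercise.

```dot
// Hypothetical graph showing each statement type once.
graph statements {
    rankdir = LR;                         // ID '=' ID statement; the semicolon is optional
    node [shape = circle, color = gray]   // attr_stmt with comma-separated attributes
    edge [penwidth = 2; style = dashed]   // semicolons may separate attributes too

    a                                     // node_stmt without attributes
    b [label = "B"] [color = blue]        // attribute lists may be repeated
    a -- b -- c [color = red]             // chain: one attribute list covers both edges
    c -- { d e }                          // subgraph as an endpoint: c -- d and c -- e
}
```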

Node Ports and Compass Directions

An edge endpoint can name a port (a labeled field of a record-shaped node) followed by a compass direction, which together determine where the edge attaches:

digraph connections {
    element0 [shape = record, label = "<top> header|<middle> content|<bottom> footer", height=.5]
    element1 [shape = box label="target"]

    element0:top:n -> element1:n [label = "n"]
    element0:top:ne -> element1:ne [label = "ne"]
    element0:top:e -> element1:e [label = "e"]
    element0:top:se -> element1:se [label = "se"]
    element0:top:s -> element1:s [label = "s"]
    element0:top:sw -> element1:sw [label = "sw"]
    element0:top:w -> element1:w [label = "w"]
    element0:top:nw -> element1:nw [label = "nw"]
    element0:top:c -> element1:c [label = "c"]
    element0:top:_ -> element1:_ [label = "_"]

    // The port in a node statement is ignored, so these attributes style the whole of element0
    element0:middle[style=filled color=lightblue]
}


    Direction   Meaning

    n           north
    ne          northeast
    e           east
    se          southeast
    s           south
    sw          southwest
    w           west
    nw          northwest
    c           center
    _           any available position

Graphical Attributes

Common attributes fall into three categories: node properties, edge properties, and subgraph properties.

Global Attribute Defaults

Setting default attributes eliminates repetitive per-element configuration:

digraph demo {
    rankdir = LR
    node [shape=box color=blue]
    edge [color=red]

    item_a
    item_b [color=lightblue]
    item_a -> item_b
    item_b -> item_a [color=green]
}


All elements defined after the defaults inherit those properties unless overridden.
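
Because defaults only affect elements created after them, statement order matters. The sketch below, with hypothetical node names, shows one node created before the default and two created after it.

```dot
// Hypothetical example: the position of a default statement matters.
digraph default_order {
    early                      // created before the default: keeps the ellipse shape
    node [shape = box, style = filled, fillcolor = lightyellow]
    late                       // created after the default: filled box
    custom [fillcolor = pink]  // inherits box and filled, overrides the fill color

    early -> late -> custom
}
```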

Record-Based Node Structures

Complex internal structures can be rendered using record shapes:

digraph hierarchical {
    charset = "UTF-8"
    node [shape = record, fontname = "Microsoft YaHei", fontsize = 14]

    country [label = " Country | { Province | { City | { District | { Landmark | { Pavilion | Bridge } | Wetland} | Yuhang } | Ningbo}}"]

    pool_struct [
        color = "cornflowerblue"
        label  = "pool_struct | {
            {d | {
                *last |
                *end |
                *next |
                failed
            }}|
            *max |
            *current |
            *chain |
            *cleanup |
            *log
            }"
    ]
}


Frequently Used Attributes

  • charset: Character encoding of the input, set on the graph; the default is "UTF-8"
  • fontname: Font family; pick a font that contains the glyphs you need, such as "Microsoft YaHei" for CJK characters, to avoid rendering issues
  • fontcolor: Text color
  • fontsize: Text size in points
  • fillcolor: Background fill for nodes and clusters; takes effect together with style = filled
  • size: Maximum diagram dimensions in inches
  • label: Text label displayed on elements
  • margin: Margin around the drawing in inches; on a node, the space left around its label
  • pad: Extra space, in inches, added around the minimum area needed to draw the graph
  • style: Element styling (e.g., filled for solid backgrounds)
  • rankdir: Graph orientation: "TB" (top-bottom), "LR" (left-right), "BT" (bottom-top), "RL" (right-left)
  • ranksep: Minimum separation between adjacent ranks in inches (vertical spacing when rankdir is "TB")
  • ratio: Output image aspect ratio
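
To show how several of these attributes combine, here is a small sketch. The pipeline stages, the font, and the numeric values are assumptions picked for the example, not recommendations.

```dot
// Hypothetical example: graph-level attributes plus node defaults.
digraph page_setup {
    rankdir = LR                 // lay ranks out left to right
    size = "6,4"                 // fit within 6 x 4 inches
    pad = 0.3                    // extra space around the drawing, in inches
    ranksep = 1.0                // at least one inch between ranks
    label = "Build pipeline (hypothetical)"
    fontname = "Helvetica"
    fontsize = 16
    fontcolor = gray20

    node [shape = box, style = filled, fillcolor = lightgray, fontname = "Helvetica"]

    fetch -> compile -> test -> package
}
```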

Node Properties

Default node properties include shape = ellipse, width = .75, height = 0.5 with the node identifier serving as the display label.

Common node attributes:

  • shape: Element shape (box, ellipse, record, etc.)
  • width/height: Element dimensions; when fixedsize = true, these are the final dimensions
  • fixedsize: When false, dimensions adapt to content
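  • rank: Rank constraint on the nodes of a subgraph; it is set on the subgraph, not on individual nodes:
    • same: All nodes are placed on the same rank
    • min: All nodes are placed on the minimum rank
    • source: Like min, and only nodes from subgraphs with rank source or min may occupy the minimum rank
    • max: All nodes are placed on the maximum rank
    • sink: Like max, and only nodes from subgraphs with rank sink or max may occupy the maximum rank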
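
The rank constraint is usually written inside an anonymous subgraph. The sketch below assumes a hypothetical four-stage flow with two extra nodes.

```dot
// Hypothetical example: rank constraints set inside anonymous subgraphs.
digraph rank_demo {
    node [shape = box]

    begin -> parse -> check -> emit
    begin -> log

    { rank = same; parse; log }      // parse and log are placed on one rank
    { rank = sink; emit; summary }   // both are pushed to the last rank
}
```

Here summary has no edges at all, so the rank constraint is the only thing that determines where it is placed.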

Edge Properties

Directed graphs use -> while undirected graphs use --. Multiple connections can be chained with a single attribute list:

digraph {
    rankdir = LR
    source -> middle -> dest[color=green]
}


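The example below sets the graph attribute splines to ortho and colors three edge chains; the attributes relevant to edges are listed after it: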

digraph {
    rankdir = LR
    splines = ortho

    a -> b -> c -> d -> f [color = green]
    e -> f -> b -> d [color = blue]
    b -> e -> h[color = red]
}


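  • len: Preferred edge length in inches (used by the neato and fdp layouts, not by dot)
  • weight: Edge weight; in dot, a higher value makes the edge shorter, straighter, and more vertical
  • lhead: Name of the cluster at which the edge head is clipped; requires compound = true on the graph
  • ltail: Name of the cluster at which the edge tail is clipped; requires compound = true on the graph
  • headlabel: Label near the arrow head
  • taillabel: Label near the tail
  • splines: Edge routing style, set on the graph:
    • none: No edges drawn
    • true/spline: Edges are drawn as splines routed around nodes
    • false/line: Edges are drawn as straight line segments
    • polyline: Edges are drawn as polylines
    • curved: Edges are drawn as curved arcs
    • ortho: Edges are drawn as horizontal and vertical segments
  • dir: Which ends of the edge get arrowheads: forward, back, both, or none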
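
A short sketch of the label and direction attributes follows. The node names and label texts are hypothetical.

```dot
// Hypothetical example: edge labels, arrow direction, and weight.
digraph edge_labels {
    rankdir = LR

    client -> server [label = "request", taillabel = "tail", headlabel = "head"]
    server -> client [dir = both, color = blue]     // arrowheads at both ends
    client -> cache  [dir = none, style = dashed]   // no arrowheads
    server -> db     [dir = back, weight = 10]      // arrowhead drawn at the tail end
}
```

Note that dir only changes where arrowheads are drawn; the edge still runs from tail to head for ranking purposes.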

Subgraph and Cluster Usage

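A subgraph is drawn with a bounding box only when its name begins with cluster. With compound = true on the graph, the lhead and ltail edge attributes clip edges at the cluster boundaries: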

digraph G {
    compound = true
    ranksep = 1
    node [shape = record]

    subgraph cluster_hardware {
        label = "hardware"
        color = lightblue
        CPU Memory
    }

    subgraph cluster_kernel {
        label = "kernel"
        color = green
        Init IPC
    }

    subgraph cluster_libc {
        label = "libc"
        color = yellow
        glibc
    }

    CPU -> Init [lhead = cluster_kernel ltail = cluster_hardware]
    IPC -> glibc [lhead = cluster_libc ltail = cluster_kernel]
}


Practical Examples

TCP/IP State Transition Diagram

Two approaches demonstrate different layout strategies:

digraph tcp_states {
    compound=true
    fontsize=10
    margin="0,0"
    ranksep = .75
    nodesep = .65

    node [shape=Mrecord fontname="Inconsolata, Consolas", fontsize=12, penwidth=0.5]
    edge [fontname="Inconsolata, Consolas", fontsize=10, arrowhead=normal]

    "State Machine" [shape = "plaintext", fontsize = 16]

    "CLOSED" -> "LISTEN" [style = bold, label = "Passive open\nSend: <none>"];
    "LISTEN" -> "SYN_RECV" [style = bold, label = "Recv: SYN\nSend: SYN,ACK"]
    "SYN_RECV" -> "ESTABLISHED" [style = bold, label = "Recv: ACK\nSend: <none>", weight = 20]
    "ESTABLISHED" -> "CLOSE_WAIT" [style = bold, label = "Recv: FIN\nSend: ACK", weight = 20]

    subgraph cluster_passive_close {    
        style = dotted
        margin = 10
        passive_close [shape = plaintext, label = "Passive Close", fontsize = 14]
        "CLOSE_WAIT" -> "LAST_ACK" [style = bold, label = "App: close\nSend: FIN", weight = 10]
    }
    "LAST_ACK" -> "CLOSED" [style = bold, label = "Recv: ACK\nSend: <none>"]

    "CLOSED" -> "SYN_SENT" [style = dashed, label = "App: active open\nSend: SYN"];    
    "SYN_SENT" -> "ESTABLISHED" [style = dashed, label = "Recv: SYN,ACK\nSend: ACK", weight = 25]
    "SYN_SENT" -> "SYN_RECV" [style = dotted, label = "Recv: SYN\nSend: SYN,ACK\nSimultaneous open"]
    "ESTABLISHED" -> "FIN_WAIT_1" [style = dashed, label = "App: close\nSend: FIN", weight = 20]
    
    subgraph cluster_active_close {
        style = dotted
        margin = 10
        active_close [shape = plaintext, label = "Active Close", fontsize = 14]

        "FIN_WAIT_1" -> "FIN_WAIT_2" [style = dashed, label = "Recv: ACK\nSend: <none>"]
        "FIN_WAIT_2" -> "TIME_WAIT" [style = dashed, label = "Recv: FIN\nSend: ACK"]
        "FIN_WAIT_1" -> "CLOSING" [style = dotted, label = "Recv: FIN\nSend: ACK"]
        "FIN_WAIT_1" -> "TIME_WAIT" [style = dotted, label = "Recv: FIN,ACK\nSend: ACK"]
        "CLOSING" -> "TIME_WAIT" [style = dotted]
    }
    
    "TIME_WAIT" -> "CLOSED" [style = dashed, label = "2MSL timeout"]
}


A more refined version with better alignment:

digraph tcp_refined {
    compound=true
    margin="0,0"
    ranksep = .75
    nodesep = 1
    pad = .5

    charset = "UTF-8"
    node [shape=Mrecord, fontname="Microsoft YaHei", fontsize=14]
    edge [fontname="Microsoft YaHei", fontsize=11, arrowhead = normal]

    CLOSED -> LISTEN [style = dashed, label = "Passive open\nSend: <none>", weight = 100];
    
    "State Machine" [shape = "plaintext", fontsize = 16]

    {
        rank = same
        SYN_RCVD SYN_SENT
        anchor_1 [shape = point, width = 0]
        
        SYN_SENT -> anchor_1 [style = dotted, label = "App close or timeout"]
        SYN_RCVD -> SYN_SENT [style = dotted, dir = back, headlabel = "Recv: SYN\nSend: SYN,ACK\nSimultaneous"]
    }

    LISTEN -> SYN_RCVD [style = dashed, headlabel = "Recv: SYN\nSend: SYN,ACK"]
    SYN_RCVD -> LISTEN [style = dotted, headlabel = "Recv: RST"]
    CLOSED:e -> SYN_SENT [style = bold, label = "Active open\nSend: SYN"]

    {
        rank = same
        ESTABLISHED CLOSE_WAIT
        ESTABLISHED -> CLOSE_WAIT [style = dashed, label = "Recv: FIN\nSend: ACK"]
    }

    SYN_RCVD -> ESTABLISHED [style = dashed, label = "Recv: ACK\nSend: <none>", weight = 9]
    SYN_SENT -> ESTABLISHED  [style = bold, label = "Recv: SYN,ACK\nSend: ACK", weight = 10]

    {
        rank = same
        FIN_WAIT_1
        CLOSING 
        LAST_ACK
        anchor_2 [shape = point, width = 0]

        FIN_WAIT_1 -> CLOSING [style = dotted, label = "Recv: FIN\nSend: ACK"]
        LAST_ACK -> anchor_2 [style = dashed, label = "Recv: ACK\nSend: <none>"]
    }

    CLOSE_WAIT -> LAST_ACK [style = dashed, label = "App: close\nSend: FIN", weight = 10]

    {
        rank = same
        FIN_WAIT_2  TIME_WAIT
        anchor_3 [shape = point, width = 0]
        TIME_WAIT -> anchor_3 [style = bold, label = "2MSL timeout"]
    }

    ESTABLISHED -> FIN_WAIT_1 [style = bold, label = "App: close\nSend: FIN"]
    FIN_WAIT_1 -> FIN_WAIT_2 [style = bold, headlabel = "Recv: ACK\nSend: <none>", weight = 15]
    FIN_WAIT_2 -> TIME_WAIT [style = bold, label = "Recv: FIN\nSend: ACK", weight = 10]

    CLOSING -> TIME_WAIT [style = dotted, label = "Recv: ACK\nSend: <none>", weight = 15]
    FIN_WAIT_1 -> TIME_WAIT [style = dotted, label = "Recv: FIN,ACK\nSend: ACK"]

    anchor_3 -> anchor_2 [arrowhead = none, style = dotted, weight = 10]
    anchor_2 -> anchor_1 [arrowhead = none, style = dotted]
    anchor_1 -> CLOSED [style = dotted]
}


The improved version demonstrates effective use of rank = same for horizontal alignment and weight attributes for controlling edge straightness. Note that using rank constraints may prevent the use of subgraph cluster boundaries for certain grouped elements.
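
The alignment techniques used in the refined diagram can be reduced to a small sketch: a rank = same group, a zero-width point node that acts as a routing anchor, and weighted edges. The node names here are hypothetical and are not taken from the TCP diagram.

```dot
// Hypothetical example: rank = same, a point anchor, and weighted edges.
digraph alignment {
    node [shape = box]

    {
        rank = same
        left; right
        turn [shape = point, width = 0]     // tiny routing node used as an anchor
        right -> turn [arrowhead = none]    // first leg of the routed edge
    }

    top -> left    [weight = 10]   // heavy edge: kept short and straight
    top -> right
    left -> bottom [weight = 10]
    turn -> bottom                 // second leg, carrying the arrowhead
}
```

Splitting one logical edge into two legs through an anchor gives some control over where it bends, at the cost of extra nodes in the source.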

Epoll Internal Data Structures

digraph epoll_diagram {
    compound=true
    margin="0,0"
    ranksep = .75
    nodesep = 1
    pad = .5
    rankdir = LR

    charset = "UTF-8"
    node [shape=record, fontname="Microsoft YaHei", fontsize=14]
    edge [style = dashed, fontname="Microsoft YaHei", fontsize=11]

    epoll [shape = plaintext, label = "Epoll Structures and Relationships"]

    eventpoll [
        color = cornflowerblue,
        label = "<eventpoll> struct \n eventpoll |
            <lock> spinlock_t lock; |
            <mutex> struct mutex mtx; |
            <wq> wait_queue_head_t wq; |
            <poll_wait> wait_queue_head_t poll_wait; |
            <rdllist> struct list_head rdllist; |
            <ovflist> struct epitem *ovflist; |
            <rbr> struct rb_root_cached rbr; |
            <ws> struct wakeup_source *ws; |
            <user> struct user_struct *user; |
            <file> struct file *file; |
            <visited> int visited; |
            <visited_list_link> struct list_head visited_list_link;"
    ]

    epitem [
        color = sienna,
        label = "<epitem> struct \n epitem  |
            <rb>struct rb_node rbn;\nstruct rcu_head rcu; |
            <rdllink> struct list_head rdllink; |
            <next> struct epitem *next; |
            <ffd> struct epoll_filefd ffd; |
            <nwait> int nwait; |
            <pwqlist> struct list_head pwqlist; |
            <ep> struct eventpoll *ep; |
            <fllink> struct list_head fllink; |
            <ws> struct wakeup_source __rcu *ws; |
            <event> struct epoll_event event;"
    ]

    epitem2 [
        color = sienna,
        label = "<epitem> struct \n epitem |
            <rb>struct rb_node rbn;\nstruct rcu_head rcu; |
            <rdllink> struct list_head rdllink; |
            <next> struct epitem *next; |
            <ep> struct eventpoll *ep; |
             ··· |
             ··· "
    ]

    eppoll_entry [
        color = darkviolet,
        label = "<entry> struct \n eppoll_entry |
            <llink> struct list_head llink; |
            <base> struct epitem *base; |
            <wait> wait_queue_entry_t wait; |
            <whead> wait_queue_head_t *whead;"
    ]

    epitem:ep -> eventpoll:se [color = sienna]
    epitem2:ep -> eventpoll:se [color = sienna]
    eventpoll:ovflist -> epitem:next -> epitem2:next [color = cornflowerblue]
    eventpoll:rdllist -> epitem:rdllink -> epitem2:rdllink [dir = both]
    eppoll_entry:llink -> epitem:pwqlist [color = darkviolet]
    eppoll_entry:base -> epitem:nw  [color = darkviolet]
}


Outstanding Items

  1. Add cluster boundaries around the active close sequence in the TCP/IP diagram
  2. Add TCP/IP sequence diagram elements

VSCode Preview Integration

  1. Download Graphviz from the official website
  2. Install the "Graphviz Preview" extension in VSCode
  3. Configure the dot executable path in settings.json: "graphvizPreview.dotPath": "path\\to\\graphviz\\bin\\dot.exe"
  4. Create a new .dot file and use the preview button in the editor toolbar

Note: Automated layout works well for most cases, but manual positioning may be preferable for diagrams requiring precise element placement.

References

  1. Graphviz Official Documentation
