Scraper - Analyze qpid-dispatch log files

Scraper provides two analysis modes:

  • Normal mode: combine logs and show details.
  • Split mode: split a single log into per-connection data and show details.

Details are written to stdout in html format.

Apache License, Version 2.0

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Normal Mode

Scraper is a data scraping program. It reads qpid-dispatch router log files, categorizes and sorts the data, and produces an HTML summary. In normal mode Scraper can operate on a single log file or on many log files.

Normal mode does not write any files to your file system. The result is printed on stdout.

From each log file Scraper extracts:

  • Router version
  • Router container name
  • Router restart times. A single log file may contain data from several router reboot instances.
  • Router link state calculation reports
  • Interrouter and client connections
  • AMQP facts
    • Connection peers
    • Link pair establishment
    • Transfer traffic
    • Message disposition
    • Flow and credit propagation

Scraper sorts these facts with microsecond precision using the log timestamps.

Then Scraper merges the data from any number (as long as that number is less than 27!) of independent log files into a single view.
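
As an informal illustration of the merge (a minimal sketch, not Scraper's actual implementation), log files that are individually in time order can be combined into one microsecond-ordered stream keyed on the timestamp at the start of each line. The timestamp format matches the examples shown later for --time-start; the file names are the A.log and B.log used elsewhere in this README.

    import heapq
    from datetime import datetime

    TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"   # e.g. "2018-08-13 13:15:00.123456"

    def timestamped_lines(path):
        # Yield (timestamp, line) for lines that begin with a parseable timestamp.
        # Continuation lines without a timestamp are simply skipped in this sketch.
        with open(path) as f:
            for line in f:
                try:
                    ts = datetime.strptime(line[:26], TS_FORMAT)
                except ValueError:
                    continue
                yield ts, line.rstrip("\n")

    def merge_logs(paths):
        # Each log file is already in time order, so a k-way merge of
        # (timestamp, line) tuples yields a single ordered view of all routers.
        return heapq.merge(*(timestamped_lines(p) for p in paths))

    for ts, line in merge_logs(["A.log", "B.log"]):
        print(ts, line)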

Next Scraper performs some higher-level analysis.

  • Routers are identified by letter rather than by the container name: ‘A’, ‘B’, and so on. Log data in a file is grouped into instances and is identified by a number for that router instance: ‘A0’, ‘A1’, and so on.

  • Per router, each AMQP data log entry is sorted into per-connection data lists (a short sketch of the grouping and naming scheme appears after this list).

  • Connection data lists are searched to discover router-to-router and router-to-client connection pairs.

  • Per connection data are subdivided into per-session and per-link lists, sorting the AMQP data into per-link-only views. This information is shown in the Connection Details section. Clicking on the lozenge will expand a connection into a view that shows all the sessions. Clicking on a session lozenge will expand into a view that shows all the links.

    Each link shows a lozenge that toggles the link data visibility and a hyperlink that jumps to the link data. When the table of links is small and visible on one screen it’s easy to see what’s going on. But when the table of links is large, say 2000 links, the data for each link is 2000 lines later in the web page, so clicking the lozenge alone does not bring the data into view. For larger data sets the usage rule is to first click the lozenge to expose the data, and then click the hyperlink to jump to it.

  • For each link the first AMQP Attach is examined to discover the link's source or target address. All the addresses named in all the links are aggregated into a table.

    To view the link traffic first click the lozenge in the Link column to make the data visible and then click the hyperlinked link name to see the data.

  • Bulk AMQP data may be shown or hidden on arbitrary per-connection selections. Note: This feature is slow when a few tens of thousands of lines are being shown or hidden.

  • Noteworthy AMQP frames are identified; these are hard to find by hand.

    • AMQP errors
    • Presettled transfers
    • Transfers with ‘more’ bit set
    • Resumed transfers
    • Aborted transfers
    • Flow with ‘drain’ set
    • Probable unsettled transfers.
    • Possible unsettled transfers. When log files are truncated and not all log lines for this connection, session, and link are present then settlement calculations are compromised. Transfers may appear to be unsettled but the settlement could not be correlated properly with the transfer.
  • Transfer messages are sorted by signature. Then a table is made showing where each message leaves or arrives over a connection. The signature is limited to the body of the message as shown in the Qpid Proton log text.

  • Settlement state for each unsettled transfer is identified, displayed, and shown with delta and elapsed time values. See example in the Advanced section.

  • Link name propagation for each named link is shown in a table.

  • Router, peer, and link names can get really long. Nicknames for each are used with popups showing the full name.

  • Transfer data is identified with nicknames but without the popups. The popups were so large that Firefox refused to show them, and they were not particularly useful anyway.

  • Router link state cost calculations are merged with router restart records to create a comprehensive link state cost view. Routers may publish cost reports that do not include all other routers. In this case the other routers are identified visually to indicate that they are unreachable.
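
The lettering and grouping scheme above can be pictured with a small sketch (the parsed records here are hypothetical, and Scraper's internal data structures differ):

    from collections import defaultdict
    from string import ascii_uppercase

    def router_letter(file_index):
        # Routers get letters 'A', 'B', ... in file order; the 26-letter
        # alphabet is why normal mode handles fewer than 27 log files.
        return ascii_uppercase[file_index]

    # Hypothetical parsed records: (file index, restart instance, connection id, AMQP entry)
    records = [
        (0, 0, 11, "attach ..."),
        (0, 0, 11, "transfer ..."),
        (0, 1, 6, "flow ..."),
    ]

    per_connection = defaultdict(list)
    for file_index, instance, conn_id, entry in records:
        # Connection names follow the A<instance>_<connectionId> pattern, e.g. "A0_11"
        name = "%s%d_%d" % (router_letter(file_index), instance, conn_id)
        per_connection[name].append(entry)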

Split Mode

Split mode was developed to help analyze log files that are generally too big to be processed by Scraper normal mode. Split mode works equally well on both large and small log files.

Split mode rewrites all of the log file's AMQP log data into many smaller files, possibly tens of thousands of them. Data is confined to a subdirectory named after the original log file. Scraper will not overwrite existing output directories.

| Log file | Scraper output directory |
|---|---|
| EB2.log | EB2.log.splits |

Split mode directs Scraper to accept a single log file and break it into per-connection data files. The connections are counted and the volume of data handled by each connection is noted. Then each connection's data is written to a subfolder whose name indicates the amount of data the connection handled.

| Log lines | Directory |
|---|---|
| 250 | 10e3 |
| 45,000 | 10e5 |
| 1,234,567 | 10e7 |
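
The directory names appear to encode the number of digits in the connection's log line count. A plausible rule, sketched here as a guess rather than Scraper's exact code (the connection file name below is hypothetical):

    import os

    def bucket_dir(line_count):
        # 250 -> "10e3", 45000 -> "10e5", 1234567 -> "10e7"
        return "10e%d" % len(str(line_count))

    # One possible layout for a connection split out of EB2.log:
    path = os.path.join("EB2.log.splits", bucket_dir(45000), "A0_11.log")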

Split mode also analyzes:

  • Each router reboot starts a new router instance. Split mode observes the reboots and divides the connection data. Connection names include the router instance ID.

    A<instance>_<connectionId>

  • Connection peers are identified. Scraper currently recognizes these peers:

    • Inter-router
    • Apache activemq-artemis broker
    • Apache qpidd broker
    • Apache Qpid-JMS client
  • Scraper searches all log lines for AMQP errors and produces a table identifying them.

  • Connections are listed in order based on

    • Total transfer count
    • Total log line count
    • Connections with zero transfers
  • Per-connection files are linked from the connection lists for one-click viewing

  • AMQP Addresses from every AMQP Attach are indexed. A table for each address shows when the address was referenced and some connection details.

AMQP performative legend

Scraper decorates performative display lines with important AMQP values.

| Performative | Decorations |
|---|---|
| open | [0] |
| close | [0] |
| begin | [channel, remoteChannel] |
| end | [channel] |
| attach | [channel, handle] role linkName (source: src, target: tgt) |
| detach | [channel, handle] |
| flow | [channel, handle] (deliveryCount, linkCredit) |
| transfer | [channel, handle] (deliveryId) .. [flags] length (settlement state) |
| disposition | [channel] (role firstId-lastId settleFlags settleState) |
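
As an illustration of how a couple of these decorations combine the listed values (the rendering is approximate, not Scraper's exact output):

    def decorate_flow(channel, handle, delivery_count, link_credit):
        # e.g. "flow [0,1] (5, 250)"
        return "flow [%d,%d] (%d, %d)" % (channel, handle, delivery_count, link_credit)

    def decorate_transfer(channel, handle, delivery_id, flags, length, state):
        # e.g. "transfer [0,1] (17) .. [more] 512 (unsettled)"
        return "transfer [%d,%d] (%d) .. [%s] %d (%s)" % (
            channel, handle, delivery_id, flags, length, state)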

Sequence generator

The sequence generator (switches --sequence or -sq) generates some extra output that is useful for making sequence diagrams from the raw data. This is an experimental feature.

When a sequence switch is set, the AMQP performatives are simplified and dumped into the html data stream near the end.

A user then identifies the parts of the log data of interest and extracts the lines in question from the html data stream file. Careful attention to the start and end timestamps makes the data easier to find. Save the selected data into an intermediate file.

Finally, the intermediate file is processed by tools/scraper/seq-diag-gen.py to produce the sequence diagram source file.

Items to do to make this a reliable feature:

  • In most cases the non-instantaneous messages do not line up precisely between the sender’s and receiver’s time lines.
  • The vertical time scale is wildly exaggerated. There is no scale: equal intervals in vertical space could be microseconds or days.
  • Messages from non-router actors are shown as horizontal lines. This tool has no access to those actors' trace logs or timestamps.
  • You may want to manually specify actors and participants to get the vertical time lines in an acceptable order.

Quick Start

  • Enable router logging

The routers need to generate the proper logging for Scraper. Expose the information that Scraper requires by enabling specific log levels for some of the router logging modules, as shown in the following tables (a sample log configuration sketch appears after this Quick Start list).

| Log module and level | Information exposed in logs |
|---|---|
| ROUTER info | Router version |
| SERVER info | Router restart discovery |
| SERVER trace | AMQP control and data - versions 1.10 and earlier |
| PROTOCOL trace | AMQP control and data - versions 1.11 and later |
| ROUTER_LS info | Router link state reports |
  • Enable version-specific router logging
| Version | SERVER trace | PROTOCOL trace |
|---|---|---|
| 1.10 and earlier | Required | Error - do not specify |
| 1.11 and later | Do not specify - clutters log file with info that Scraper does not use | Required |
  • Run your tests to populate log files used as Scraper input.

  • Run Scraper to generate web content

    tools/scraper/scraper.py -f somefile.log > somefile.html

    tools/scraper/scraper.py -f *.log > somefile.html

  • Profit

    firefox somefile.html
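
Putting the logging tables above into practice, a qdrouterd.conf fragment along these lines should expose what Scraper needs on router versions 1.11 and later. Treat it as a sketch rather than a canonical configuration, and use SERVER trace instead of PROTOCOL trace on 1.10 and earlier:

    log {
        module: ROUTER
        enable: info+
    }
    log {
        module: SERVER
        enable: info+
    }
    log {
        module: PROTOCOL
        enable: trace+
    }
    log {
        module: ROUTER_LS
        enable: info+
    }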

Scraper command line

usage: scraper.py [-h] [--skip-all-data] [--skip-detail] [--skip-msg-progress]
                  [--split] [--time-start TIME_START] [--time-end TIME_END]
                  [--files FILES [FILES ...]]

optional arguments:
  -h, --help            show this help message and exit
  --skip-all-data, -sa  Max load shedding: do not store/index transfer,
                        disposition, flow, or EMPTY_FRAME data
  --skip-detail, -sd    Load shedding: do not produce Connection Details
                        tables
  --skip-msg-progress, -sm
                        Load shedding: do not produce Message Progress tables
  --split, -sp          A single file is split into per-connection data.
  --time-start TIME_START, -ts TIME_START
                        Ignore log records earlier than this. Format:
                        "2018-08-13 13:15:00.123456"
  --time-end TIME_END, -te TIME_END
                        Ignore log records later than this. Format:
                        "2018-08-13 13:15:15.123456"
  --files FILES [FILES ...], -f FILES [FILES ...]
  • Split mode works with a single file and ignores all other switches
  • Normal mode (no --split switch) accepts other switches and multiple files

Switch --skip-all-data

tools/scraper/scraper.py --skip-all-data -f FILE [FILE ...]

With --skip-all-data AMQP transfer, disposition, and flow frames in the log files are discarded. The resulting web page still includes lots of useful information with connection info, link name propagation, and link state analysis.

Switch --skip-detail

tools/scraper/scraper.py --skip-detail -f FILE [FILE ...]

With --skip-detail the display of per-connection, per-session, per-link tables is skipped.

Switch --skip-msg-progress

tools/scraper/scraper.py --skip-msg-progress -f FILE [FILE ...]

With --skip-msg-progress the display of transfer analysis tables is skipped.

Advanced

  • Merging multiple qpid-dispatch log files

Scraper accepts multiple log file names on the command line and merges the log data according to the router log timestamps.

tools/scraper/scraper.py --files A.log B.log C.log > abc.html

Note that the qpid-dispatch host system clocks for merged log files must be synchronized to within a few microseconds in order for the result to be useful. This is easiest to achieve when the routers are run on the same CPU core on a single system. Running Fedora 27 and 28 on two hosts in a router network where the routers run ntp to the same time provider produces perfectly acceptable results.

Scraper does a decent job merging log files created within a qpid-dispatch self test.

  • Wow, that's a lot of data

Indeed it is and good luck figuring it out. Sometimes, though, it's too much. The AMQP transfer data analysis is the worst offender in terms of CPU time, run-time memory usage, and monstrous html output files. Scraper provides command line switches to turn off sections of the data analysis.

  • How to read the transfer analysis tables. Here's an instance:
| Src | Time | Rtr | ConnId | Dir | ConnId | Peer | T delta | T elapsed | Settlement | S elapsed |
|---|---|---|---|---|---|---|---|---|---|---|
| A0_2035 | 09:50:52.027975 | A0 | A0_11 | <- | | peer_7 | 0.000000 | 0.000000 | (accepted settled 0.005142 S) | 0.005142 |
| A0_2071 | 09:50:52.028556 | A0 | A0_6 | -> | D0_4 | D | 0.000581 | 0.000581 | (accepted settled 0.004253 S) | 0.004834 |
| D0_1587 | 09:50:52.028696 | D0 | D0_4 | <- | A0_6 | A | 0.000140 | 0.000721 | (accepted settled 0.003988 S) | 0.004709 |
| D0_1612 | 09:50:52.029260 | D0 | D0_1 | -> | C0_6 | C | 0.000564 | 0.001285 | (accepted settled 0.003044 S) | 0.004329 |
| C0_1610 | 09:50:52.029350 | C0 | C0_6 | <- | D0_1 | D | 0.000090 | 0.001375 | (accepted settled 0.002846 S) | 0.004221 |
| C0_1625 | 09:50:52.029672 | C0 | C0_1 | -> | B0_5 | B | 0.000322 | 0.001697 | (accepted settled 0.002189 S) | 0.003886 |
| B0_1438 | 09:50:52.029760 | B0 | B0_5 | <- | C0_1 | C | 0.000088 | 0.001785 | (accepted settled 0.002002 S) | 0.003787 |
| B0_1451 | 09:50:52.030117 | B0 | B0_7 | -> | | peer_7 | 0.000357 | 0.002142 | (accepted settled 0.001318 S) | 0.003460 |

Each row in this table represents the facts about when a single transfer and its corresponding settlement were seen entering or exiting a router.

| Field | Contents |
|---|---|
| Src | Router instance and file line number where the transfer was seen |
| Time | Timestamp |
| Rtr | Router letter id and instance |
| ConnId | Router connection id |
| Dir | Transfer direction: <- indicates into the router, -> indicates out of the router |
| ConnId | Peer's connection id. Blank if the peer is a normal client and not a router. |
| Peer | Peer's name. peer_7 would show the peer's container name in a popup. |
| T delta | Time since previous row |
| T elapsed | Time since the message first entered the system |
| Settlement | Settlement state and time delta since the message time in column 2 for this row. The settlement disposition log line is hyperlinked from the word accepted. |
| S elapsed | Settlement elapsed time. This is the difference between the accepted disposition log record and the time when the message first entered the system. |

Row by row, it is easiest to read each line from left to right:

  • A0 connection 11 received the transfer from peer_7.
  • A0 connection 6 sent the message to D0 connection 4.
  • D0 connection 4 received the message from A0 connection 6.

and so on. This message came from a sender on peer_7, went through routers A, D, C, and B, and finally was returned to a listener on peer_7. The receiver received the message 0.002142 S after the sender sent it. The sender received the accepted disposition 0.005142 S after the sender sent the message.

The transmit times are in order from top to bottom and the settlement times are in order from bottom to top.
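
The timing columns can be reproduced with simple timestamp arithmetic. A small check against the second row of the example table (the date is arbitrary because the table shows only times):

    from datetime import datetime

    fmt = "%Y-%m-%d %H:%M:%S.%f"
    t_first = datetime.strptime("2019-01-01 09:50:52.027975", fmt)  # row 1: message enters at A0_11
    t_row = datetime.strptime("2019-01-01 09:50:52.028556", fmt)    # row 2: A0_6 sends to D0_4

    # For row 2 the previous row is also the first row, so T delta equals T elapsed.
    t_delta = (t_row - t_first).total_seconds()    # 0.000581 = 'T delta'
    t_elapsed = (t_row - t_first).total_seconds()  # 0.000581 = 'T elapsed'
    s_elapsed = t_elapsed + 0.004253               # 0.004834 = 'S elapsed' = T elapsed + settlement delta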