| <HTML> |
| <HEAD> |
| <TITLE>Notes on Webalizer for netbeans.org</TITLE> |
| <META NAME="description" CONTENT="Webalizer Notes"> |
| <link rel="stylesheet" type="text/css" href="/netbeans.css"> |
| </HEAD> |
| |
| <BODY> |
| |
| <A NAME="webalizer-defs"><h1>Webalizer</h1></A> |
| <BR><A HREF="http://www.mrunix.net/webalizer/">Webalizer</A> is an |
| httpd logfile analysis tool, which netbeans.org uses to track website |
| traffic. |
| |
| <P>Traffic analysis for each individual module's website is available |
| at <A HREF="https://netbeans.org/download/webstats/index.html">https://netbeans.org/download/webstats/index.html</A>; |
| these results are uploaded daily. |
| |
| <P><h2>Webalizer Configuration</h2> |
| <BR>Webalizer reads config files, which control exactly what |
| is displayed on the results pages. A separate config file is used |
| for each module on netbeans.org, so the Webalizer results can be |
| customised per module. |
| |
| <P>If you are a module owner and you'd |
| like to make some changes to your Webalizer config file, first take |
| a look at your existing config file to get an idea of what is |
| possible. There are links to the config files from each module's |
| results page. Next, check out the <A HREF="ftp://ftp.mrunix.net/pub/webalizer/README/">Webalizer Readme</A>, |
| where config files and options are explained in detail. Finally, |
| <a href="https://netbeans.org/about/contact_form.html?to=1">let us know</A> what you're |
| interested in! We can't guarantee that any request will be implemented, |
| but we'll try. |
| |
| <P><h2><a name="definitions">Webalizer Definitions</a></h2> |
| <BR>From the <A HREF="ftp://ftp.mrunix.net/pub/webalizer/README/">Webalizer Readme</A>: |
| |
| <P><B>Hits</B> |
| <BR>Any request made to the server that is logged is considered a 'hit'. |
| The requests can be for anything: HTML pages, graphic images, audio |
| files, CGI scripts, etc. Each valid line in the server log is |
| counted as a hit. This number represents the total number of requests |
| that were made to the server during the specified report period. |
| |
| <P><B>Files</B> |
| <BR>Some requests made to the server require that the server then send |
| something back to the requesting client, such as an HTML page or graphic |
| image. When this happens, it is considered a 'file' and the files |
| total is incremented. The relationship between 'hits' and 'files' can |
| be thought of as 'incoming requests' and 'outgoing responses'. |
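| |
| <P>As an illustration (not part of the Readme), the relationship between hits and files can be sketched in Python. |
| The sketch assumes Common Log Format lines, and it assumes that a response with status 200 is what counts as a 'file'. |
| The function name <code>count_hits_and_files</code> and the sample log lines are hypothetical. |

```python
def count_hits_and_files(log_lines):
    """Return (hits, files) for an iterable of Common Log Format lines."""
    hits = 0
    files = 0
    for line in log_lines:
        # Layout assumed: host ident user [time] "request" status bytes
        parts = line.rsplit('"', 1)
        if len(parts) != 2:
            continue  # not a recognisable log entry, so not a hit
        tail = parts[1].split()
        if not tail or not tail[0].isdigit():
            continue
        hits += 1  # every valid line is a hit
        if tail[0] == "200":
            files += 1  # assumption: status 200 means something was sent back
    return hits, files


sample = [
    '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.1 - - [10/Oct/2000:13:55:37 -0700] "GET /logo.gif HTTP/1.0" 304 -',
    '192.0.2.2 - - [10/Oct/2000:13:56:01 -0700] "GET /missing.html HTTP/1.0" 404 209',
]
print(count_hits_and_files(sample))  # (3, 1)
```

| <P>All three sample requests are hits, but only the first one counts as a file under the assumption above. |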
| |
| <P><B>Pages</B> |
| <BR>Pages are, well, pages! Generally, any HTML document, or anything |
| that generates an HTML document, would be considered a page. This |
| number counts only the 'pages' requested and does not include the |
| other 'stuff' that goes into a document, such as graphic images, |
| audio clips, etc. What actually constitutes a 'page' can vary from |
| server to server. The default action is to treat anything with the |
| extension '.htm', '.html' or '.cgi' as a page. A lot of sites will |
| probably define other extensions, such as '.phtml', '.php3' and '.pl' |
| as pages as well. Some people consider this number as the number of |
| 'pure' hits... I'm not sure if I totally agree with that viewpoint. |
| Some other programs (and people :) refer to this as 'Pageviews'. |
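| |
| <P>As an illustration (not part of the Readme), the default page rule described above can be sketched in Python. |
| The helper <code>is_page</code> is hypothetical, and matching on the exact file extension is a simplifying assumption; |
| Webalizer's own matching may differ, and directory URLs are not handled here. |

```python
import os
from urllib.parse import urlsplit

# Default page extensions named in the text; a site may configure others.
PAGE_EXTENSIONS = (".htm", ".html", ".cgi")


def is_page(url, page_extensions=PAGE_EXTENSIONS):
    """Return True if the URL path ends in one of the page extensions."""
    path = urlsplit(url).path  # drop any query string
    extension = os.path.splitext(path)[1].lower()
    return extension in page_extensions


for url in ["/docs/index.html", "/cgi-bin/search.cgi?q=ide", "/images/logo.gif"]:
    print(url, is_page(url))
```

| <P>A site that also treats '.php3' as a page could pass its own tuple, for example <code>is_page(url, PAGE_EXTENSIONS + (".php3",))</code>. |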
| |
| <P><B>Sites</B> |
| <BR>Each request made to the server comes from a unique 'site', which can |
| be referenced by a name or ultimately, an IP address. The 'sites' |
| number shows how many unique IP addresses made requests to the server |
| during the reporting time period. This DOES NOT mean the number of |
| unique individual users (real people) that visited, which is impossible |
| to determine using just logs and the HTTP protocol (however, this |
| number might be about as close as you will get). |
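| |
| <P>As an illustration (not part of the Readme), counting 'sites' amounts to counting distinct client addresses. |
| This Python sketch assumes the address is the first whitespace-separated field of each log line, as in Common Log Format; |
| the function name <code>count_sites</code> and the sample lines are hypothetical. |

```python
def count_sites(log_lines):
    """Return the number of distinct client addresses in the log lines."""
    addresses = set()
    for line in log_lines:
        fields = line.split()
        if fields:
            addresses.add(fields[0])  # first field is assumed to be the client address
    return len(addresses)


sample = [
    '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.1 - - [10/Oct/2000:13:55:37 -0700] "GET /logo.gif HTTP/1.0" 200 512',
    '192.0.2.2 - - [10/Oct/2000:13:56:01 -0700] "GET /index.html HTTP/1.0" 200 2326',
]
print(count_sites(sample))  # 2
```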
| |
| <P><B>Visits</B> |
| <BR>Whenever a request is made to the server from a given IP address |
| (site), the amount of time since the previous request from that address |
| is calculated (if any). If the time difference is greater than a |
| pre-configured 'visit timeout' value (or the address has never made a request before), |
| it is considered a 'new visit', and this total is incremented (both |
| for the site, and the IP address). The default timeout value is 30 |
| minutes (can be changed), so if a user visits your site at 1:00 in |
| the afternoon, and then returns at 3:00, two visits would be registered. |
| Note: in the 'Top Sites' table, the visits total should be discounted |
| on 'Grouped' records, and thought of as the "Minimum number of visits" |
| that came from that grouping instead. Note: Visits only occur on |
| PageType requests, that is, for any request whose URL is one of the |
| 'page' types defined with the PageType option. Due to the limitations |
| of the HTTP protocol, log rotations and other factors, this number |
| should not be taken as absolutely accurate; rather, it should be |
| considered a pretty close "guess". |
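| |
| <P>As an illustration (not part of the Readme), the visit rule described above can be sketched in Python. |
| The sketch assumes the input is already limited to page requests and sorted by time; |
| the function name <code>count_visits</code> and the sample data are hypothetical. |

```python
from datetime import datetime, timedelta

VISIT_TIMEOUT = timedelta(minutes=30)  # default timeout named in the text


def count_visits(page_requests, timeout=VISIT_TIMEOUT):
    """Count visits from (ip, datetime) page requests sorted by time."""
    last_seen = {}
    visits = 0
    for ip, when in page_requests:
        previous = last_seen.get(ip)
        # New visit: first request from this address, or the gap exceeds the timeout.
        if previous is None or when - previous > timeout:
            visits += 1
        last_seen[ip] = when
    return visits


requests = [
    ("192.0.2.1", datetime(2000, 10, 10, 13, 0)),   # first visit from .1
    ("192.0.2.2", datetime(2000, 10, 10, 13, 5)),   # first visit from .2
    ("192.0.2.1", datetime(2000, 10, 10, 13, 10)),  # same visit, 10 minutes later
    ("192.0.2.1", datetime(2000, 10, 10, 15, 0)),   # returns at 3:00, new visit
]
print(count_visits(requests))  # 3
```

| <P>The address that appears at 1:00 and again at 3:00 contributes two visits, matching the example in the text. |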
| |
| <P><B>KBytes</B> |
| <BR>The KBytes (kilobytes) value shows the amount of data, in KB, that |
| was sent out by the server during the specified reporting period. This |
| value is generated directly from the log file, so it is up to the |
| web server to produce accurate numbers in the logs (some web servers |
| do stupid things when it comes to reporting the number of bytes). In |
| general, this should be a fairly accurate representation of the amount |
| of outgoing traffic the server had, regardless of the web server's |
| reporting quirks. |
| |
| <P>Note: A kilobyte is 1024 bytes, not 1000 :) |
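| |
| <P>As an illustration (not part of the Readme), the KBytes total is the sum of the logged byte counts divided by 1024. |
| This Python sketch assumes the byte field of each log line has already been extracted as a string and that '-' means no bytes were sent; |
| the function name <code>total_kbytes</code> is hypothetical. |

```python
def total_kbytes(byte_fields):
    """Sum logged byte counts and convert to kilobytes (1 KB = 1024 bytes)."""
    total_bytes = 0
    for field in byte_fields:
        if field.isdigit():  # skip '-' and anything else that is not a number
            total_bytes += int(field)
    return total_bytes / 1024


print(total_kbytes(["2326", "-", "1770"]))  # 4.0
```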
| |
| |
| |
| </BODY> |
| </HTML> |