Real-time data statistics: pros, cons, and value analysis

With the development of the Internet, people pay more and more attention to real-time information. Micro-blogs and search engines have launched real-time search features. Whether real-time analysis of a website's own data is just as meaningful is a different question.


People who read data reports often feel that the more real-time the data, the better. They want to follow the site's changes every hour, or even every ten minutes, so that they know its current state, can spot problems, and can respond quickly. But ask them what they would actually do once they knew about a real-time change in the site's data, say a sudden jump or drop in visits during a certain period, and I think most people could not answer. Some time ago I happened to be doing work related to real-time statistics for a website, so I have a few thoughts to share.

The advantages and disadvantages of real-time statistics

Before asking whether real-time statistics are useful at all, let's first look at what has to be done to obtain statistics in real time, and what real-time data gives us in return. In other words, the pros and cons of real-time statistics.

First, from a technical point of view, real-time data clearly needs more resources. Most web analytics figures have to be computed from clickstream data; there is no ready-made data that can simply be pulled and displayed. Computing and aggregating from the clickstream undoubtedly adds cost, especially for large sites with large data volumes. Doing it in real time also increases implementation complexity and may, to some extent, make the data less accurate.
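To make the "computed from clickstream data" point concrete, here is a minimal sketch in Python. It assumes a hypothetical record format of (timestamp, visitor id, URL), which is my own invention for illustration and not any particular tool's log format. It rolls raw clicks up into ten-minute buckets of page views and unique visitors.

```python
from collections import defaultdict
from datetime import datetime

def bucket_start(ts, minutes=10):
    """Round a timestamp down to the start of its N-minute bucket.

    Assumes `minutes` divides 60 evenly (5, 10, 15, 30, ...).
    """
    return ts.replace(minute=ts.minute - ts.minute % minutes,
                      second=0, microsecond=0)

def aggregate(events, minutes=10):
    """Return {bucket_start: (page_views, unique_visitors)}.

    `events` is an iterable of (timestamp, visitor_id, url) tuples,
    a hypothetical clickstream format used only for this sketch.
    """
    views = defaultdict(int)
    visitors = defaultdict(set)
    for ts, visitor_id, _url in events:
        bucket = bucket_start(ts, minutes)
        views[bucket] += 1                 # every click counts as a page view
        visitors[bucket].add(visitor_id)   # the set de-duplicates visitors
    return {b: (views[b], len(visitors[b])) for b in sorted(views)}

# Made-up sample clicks, not real data.
events = [
    (datetime(2024, 1, 1, 8, 1, 5), "a", "/"),
    (datetime(2024, 1, 1, 8, 3, 0), "b", "/"),
    (datetime(2024, 1, 1, 8, 9, 59), "a", "/p"),
    (datetime(2024, 1, 1, 8, 10, 0), "a", "/q"),
]
stats = aggregate(events)
for bucket, (pv, uv) in stats.items():
    print(bucket.strftime("%H:%M"), pv, uv)
```

Even this toy version has to touch every click and keep a set of visitor ids per bucket. A real-time version would have to do the same thing incrementally as clicks arrive, which is where the extra cost and complexity come from.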


On the other hand, real-time data shows off data-processing ability at the technical level and makes richer reports possible, even dynamic trend charts that refresh in real time. How good that looks goes without saying, which is why many technical people are happy to do this work.

From the point of view of data application and analysis, real-time statistics can show how site traffic changes in real time, which time periods have the highest traffic or the most activity, and how traffic or users are distributed across the hours of the day. But how much does such analysis really mean for the website? Even if we know that the site has the most online users at 8 or 9 o'clock, what can we do with that? Stress testing the site is obviously not done this way.

So personally I think real-time statistics are mostly a way to monitor the state of the website in real time. For analysis they do not have much practical significance. As for how much they can contribute to website optimization and decision support, I at least don't see it.

I noticed that Avinash Kaushik also mentions this in his book: "Real-Time Data: It's Not Really Re..."

