How to conduct effective data analysis
In the second half of the Internet era, in the context of refined operations, product managers can no longer build products on gut feeling alone; they need to cultivate data awareness and use data as the basis for continuous product improvement.
Unlike a company's professional data analysts, product managers can look at data more from the user and business perspectives, and can often find the causes of data changes faster and more thoroughly.
So, assuming the data has been properly recorded, how do we analyze it effectively?
1. Clarify the purpose of data analysis
1. If the purpose of data analysis is to judge whether a page redesign is an improvement, the metrics should start from dimensions such as the page's click-through rate and bounce rate. E-commerce applications should also watch order conversion rate, while social applications should focus on visit duration and interactions such as likes and shares.
Many newcomers spend a great deal of time on the product design itself but little on thinking about how to measure its success. Writing an empty phrase like "the user experience has been improved" in the product documentation neither helps the design pass review smoothly nor leads to effective, rapid improvement of the product's KPIs.
2. If the purpose of data analysis is to explore the cause of abnormal data fluctuations in a module, the analysis should be broken down step by step following the pyramid principle: version -> time -> crowd.
For example, suppose the click-through rate of the Guess You Like module on the homepage has recently dropped sharply from 40% to 35%. First check which version the fluctuation occurred on: was it caused by an omission or error in a new release?
If the data is consistent across versions, look at when the change started. Did a new activity launched in another module of the page around the Christmas and New Year holidays cannibalize Guess You Like conversions?
If not, break it down further to see whether the composition of traffic sources has changed, for example whether an increase in newly exposed users is diluting the click-through rate.
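The version -> time -> crowd drill-down above can be sketched as a simple aggregation over event records. The records, field layout, and numbers below are entirely hypothetical, chosen only to illustrate the three cuts:

```python
from collections import defaultdict

# Hypothetical event records: (version, date, user_type, impressions, clicks)
events = [
    ("5.1", "12-20", "old", 1000, 400),
    ("5.2", "12-20", "old", 1000, 400),
    ("5.1", "12-25", "old",  500, 200),
    ("5.2", "12-25", "new", 1500, 450),
]

def ctr_by(dim):
    """Aggregate impressions and clicks along one dimension; return CTR per bucket."""
    agg = defaultdict(lambda: [0, 0])
    for rec in events:
        agg[rec[dim]][0] += rec[3]
        agg[rec[dim]][1] += rec[4]
    return {k: round(clicks / imps, 3) for k, (imps, clicks) in agg.items()}

# Step 1: split by version; if one version diverges, suspect a release bug.
by_version = ctr_by(0)
# Step 2: split by date; a drop on specific days points to holidays or competing modules.
by_date = ctr_by(1)
# Step 3: split by crowd; a surge of newly exposed users can dilute the overall CTR.
by_crowd = ctr_by(2)
```

In this toy data the overall drop traces to the crowd dimension: old users click at 40% while the newly exposed users click at only 30%, so a rising share of new users pulls the blended rate down.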
Product managers need to analyze data with a clear purpose and think about which dimensions must be constructed and verified to reach that goal. Most of the time this demands great patience, breaking things down dimension by dimension to investigate the cause.
2. Multi-channel data collection
There are generally four types of collection methods.
1. Obtain external industry reports from analysts such as Analysys or iResearch. Treat such data with caution: extract the accurate, useful information, strip out figures that may be inflated, and always be wary of secondary data that has already been processed by others.
2. Actively collect user feedback from the App Store, customer service, Weibo, and other community forums. When I have free time I often browse forums to read users' comments. Such comments are usually extreme, either glowing praise or harsh criticism, but they are still very helpful for improving your own product design: try to infer why the user felt that way at that moment.
3. Take part in questionnaire design, user interviews, and other research yourself; face users directly, collect first-hand data, and observe the problems and feelings users encounter while using the product. Questionnaires should focus on the core questions and keep the total count low, and invalid or perfunctory responses should be discarded when the results come back. In interviews, take care not to use leading words or questions that bias the user's natural reactions.
4. Study recorded user behavior data. Large companies generally have scheduled reports or emails providing daily or even real-time feedback on online user data, and they give product managers and data analysts SQL query platforms for deeper exploration and comparison.
3. Effectively eliminate interfering data
1. Choose a sample large enough to eliminate the influence of extreme or accidental data. At the 2008 Olympics, Yao Ming's three-point shooting percentage was 100% while Kobe's was 32%. Does that mean Yao Ming was the better three-point shooter? Obviously something is off: in that Olympics Yao Ming attempted only one three-pointer, while Kobe attempted 53.
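One way to make the sample-size point concrete is a confidence interval for a proportion; this sketch uses the Wilson score interval (my choice of method, not something prescribed by the article) on the shooting numbers above, taking Kobe's 32% on 53 attempts as roughly 17 makes:

```python
import math

def wilson_interval(made, attempts, z=1.96):
    """95% Wilson score confidence interval for a success proportion."""
    p = made / attempts
    denom = 1 + z**2 / attempts
    center = (p + z**2 / (2 * attempts)) / denom
    half = z * math.sqrt(p * (1 - p) / attempts + z**2 / (4 * attempts**2)) / denom
    return center - half, center + half

# Yao: 1 make on 1 attempt. Nominally 100%, but the interval is enormous.
yao = wilson_interval(1, 1)
# Kobe: ~17 makes on 53 attempts (~32%), a far tighter interval.
kobe = wilson_interval(17, 53)
```

Yao's interval stretches from roughly 21% all the way to 100%, while Kobe's spans only about 21% to 45%: with one attempt, the "100%" figure carries almost no information.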
2. Apply the same sampling rules to reduce bias in conclusions. For example, consider two push notifications. The first reads: "You have an unclaimed heart-warming takeout red envelope; the biggest one is reserved just for you, the true foodie. Tap to claim." The second reads: "A takeout treat delivered without leaving home: enjoy hot, delicious food at your door. Tap to collect." Experimental data showed the second copy's click-through rate was 30% higher than the first's. Was the second copy really more attractive? It turned out that the recipients of the second push were significantly more active users than those of the first.
3. Exclude interference from versions or holidays. Data right after a new version launches often looks very good, because users who upgrade proactively are generally highly active. When weekends or major holidays approach, consumption demand is triggered and the order conversion rate of e-commerce applications rises sharply. Therefore, when comparing data, the experimental and control groups should be aligned on the time dimension.
4. Forget historical data. Humans are different from data systems: a system remembers 100% of what it stores, while according to the Ebbinghaus forgetting curve humans retain only about 33% after 1 day, 25% after 6 days, and 21% after 31 days. So choose the observation window sensibly. For example, the Guess You Like module not only weights the scoring of interest tags but also runs a series of regression experiments against factors such as the product life cycle to obtain decay curves for users' interests and purchasing tendencies, using regular time-based discounting of old data to improve the module's click-through rate.
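The time-based discounting described above is often implemented as exponential decay. This is a minimal sketch; the half-life of 7 days is a hypothetical tuning knob, not a value from the article:

```python
def decayed_score(base_score, days_ago, half_life=7.0):
    """Halve an interest-tag's weight every `half_life` days (hypothetical parameter)."""
    return base_score * 0.5 ** (days_ago / half_life)

# A click from today keeps full weight; one from two weeks ago keeps only a quarter.
fresh = decayed_score(1.0, days_ago=0)
stale = decayed_score(1.0, days_ago=14)
```

Scoring each tag this way lets stale interests fade automatically instead of requiring hard cutoffs when old data is deleted.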
5. Split out an A1 group in experiments: besides experimental group B and control group A, add a second control group A1 whose rules are identical to A's. Then compare the A-vs-B difference against the A-vs-A1 fluctuation to rule out natural or abnormal variation in the data. My own A/B experiments show that the A1 group is important and necessary: no matter how large the data volume, two groups running identical rules will still show small fluctuations, and in refined operations such fluctuations can seriously bias our judgment.
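The A/A1/B setup can be sketched with a simple simulation. The group sizes, conversion rates, and the "three times the noise floor" rule below are all illustrative assumptions, not figures from the article:

```python
import random

def simulate_group(n, true_rate, seed):
    """Simulate per-user conversions for one bucket (hypothetical rates)."""
    rng = random.Random(seed)
    return sum(rng.random() < true_rate for _ in range(n)) / n

a  = simulate_group(10_000, 0.30, seed=1)   # control group A
a1 = simulate_group(10_000, 0.30, seed=2)   # second control A1, identical rules to A
b  = simulate_group(10_000, 0.33, seed=3)   # experimental group B

natural_noise = abs(a - a1)   # fluctuation between two identically treated groups
observed_lift = b - a         # lift attributed to the new strategy
# Only trust the lift when it clearly exceeds the A-vs-A1 noise floor.
significant = observed_lift > 3 * natural_noise
```

Even with identical rules, A and A1 will not match exactly; the size of that gap is the yardstick against which B's apparent lift should be judged.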
4. Review the data reasonably and objectively
1. Don't ignore silent users
Product managers sometimes make decisions after hearing feedback from a handful of users and spend a lot of time building the corresponding features. Often those features turn out to be the urgent need of only a tiny minority, of no interest to most users, and may even run counter to the demands of core users, causing data to plummet after the new version launches.
Ignoring silent users and failing to weigh the core needs of the product's majority target users may waste manpower and resources, or worse, cause missed business opportunities.
2. Comprehensively understand the data results
If the experimental results deviate noticeably from our experience and intuition, do not blindly jump to conclusions or immediately question your instincts; instead, analyze the data more thoroughly. For example, I once ran an experiment showing active pop-ups to users on the homepage. The experimental group far exceeded the control group in homepage click-through rate, order conversion rate, and even 7-day retention, and the conversion rate of every module on the homepage improved significantly, well beyond our expectations. Did the active pop-up really stimulate user conversion?
Later we found that users who could be shown the active pop-up tended to be in good network conditions, typically on Wi-Fi, while users who were not shown the pop-up were likely in mobile scenarios such as buses, subways, and shopping malls where connectivity is poor, which skewed the results of the A/B experiment.
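A confound like the network-condition one above shows up clearly when you stratify before comparing. The counts below are invented purely to reproduce the pattern described, where the pop-up only appears on good networks and Wi-Fi users convert better regardless:

```python
# Hypothetical counts: (saw_popup, network) -> (users, conversions)
groups = {
    (True,  "wifi"):   (8000, 1600),
    (True,  "mobile"): (2000,  200),
    (False, "wifi"):   (2000,  380),
    (False, "mobile"): (8000,  760),
}

def rate(pred):
    """Pooled conversion rate over all buckets matching a predicate on the key."""
    users = sum(u for k, (u, c) in groups.items() if pred(k))
    convs = sum(c for k, (u, c) in groups.items() if pred(k))
    return convs / users

overall_popup    = rate(lambda k: k[0])          # naive comparison: looks great
overall_no_popup = rate(lambda k: not k[0])
wifi_popup       = rate(lambda k: k[0] and k[1] == "wifi")
wifi_no_popup    = rate(lambda k: not k[0] and k[1] == "wifi")
```

Pooled, the pop-up group converts at 18% versus 11.4%, but within the Wi-Fi stratum the gap nearly vanishes (20% versus 19%): most of the apparent lift came from the pop-up group simply containing more Wi-Fi users.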
3. Don’t rely too much on data
Over-reliance on data will, on one hand, lead us to do a lot of worthless data analysis; on the other, it limits the inspiration and creativity a product manager should have. As Luo Zhenyu put it in his "Friends of Time" New Year's Eve speech: giving users whatever they want, guessing it even before they say it, is the "maternal love algorithm," and in content distribution no one does it better than Toutiao. But the maternal love algorithm has a big drawback: over time its recommendations grow narrower and narrower.
On the other side is the "paternal love algorithm": standing high and seeing far, it tells users to put down the junk in their hands and follow it to something better. Like the iPhone products Steve Jobs created back then: he ran no market analysis and no user research, yet built products that exceeded user expectations.
5. Summary
Netflix, the most successful video site in the United States, used big data to analyze user habits, reaching deep into the creative process of filmmaking to shape the hit series "House of Cards." Yet Netflix's own staff caution us not to be obsessed with big data.
If a series rated 9 counts as a high-quality work, big data can save us from the risk of a flop rated 6 or below, but it can also lead us, step by step, toward mediocrity, where the vast majority of output sits between 7 and 8.