Generate insights from a Coursera course dataset through GPT-4. Explore, analyze, and create visualizations to enhance your data-driven decision making. The conversation with the GPT is included below, and the link to a Medium article based on the data is on the left. Enjoy!
REFLECTION REPORT:
Analytics Plan
This analytics plan is designed to review a Coursera dataset. I will focus on finding insights in columns such as skills, ratings, and reviews. The plan is to let the GPT guide me through the process, with one rule for myself: do not confuse the GPT with bad prompts.
Clean and Prepare data.
Complete a statistical summary and a distribution analysis.
Create new features from 'skills' and 'metadata'.
Find the company offering the most courses.
Examine the relationship between ratings and reviews.
Get good, clean visuals and finish.
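For readers who want to try the first steps of this plan outside the chat, here is a minimal pandas sketch of the cleaning, summary, and "most courses" steps. The column names (`title`, `organization`, `rating`) and the sample rows are assumptions made for illustration; they are not taken from the real dataset or from the code the GPT ran.

```python
import pandas as pd

# Tiny stand-in for the Coursera data. Column names and values are assumptions.
df = pd.DataFrame({
    "title": ["Course A", "Course B", "Course C", "Course A"],
    "organization": ["Org X", "Org Y", "Org X", "Org X"],
    "rating": ["4.8", "4.5", None, "4.8"],
})


def clean(frame: pd.DataFrame) -> pd.DataFrame:
    """Drop exact duplicate rows and convert the rating strings to numbers."""
    out = frame.drop_duplicates().copy()
    # Values that cannot be parsed become NaN instead of raising an error.
    out["rating"] = pd.to_numeric(out["rating"], errors="coerce")
    return out


cleaned = clean(df)

# Statistical summary of the numeric rating column (count, mean, quartiles...).
summary = cleaned["rating"].describe()

# Number of courses per organization, and the organization with the most.
courses_per_org = cleaned["organization"].value_counts()
top_org = courses_per_org.idxmax()

print(summary)
print(courses_per_org)
print("Organization with the most courses:", top_org)
```

On the real file, the same steps would start from `pd.read_csv(...)` with whatever column names the dataset actually uses.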
Document Interaction with GPT
For my interaction with the GPT, I wanted to keep things very simple so as not to cause chaos in the chat. I started by asking it to tell me about the data structure and any cleaning required. It responded with the different data types and with suggestions to clean up three of the columns, which were made up of strings.
I then asked the GPT to guide me through the rest of the data exploration process and to tell me what the next 5 steps should be.
The next step was to ask the GPT, one at a time, to complete the statistical summary, review count extraction, text processing (separating skills), categorical data analysis, and correlation analysis it had suggested.
By following exactly what the GPT suggested, in its own words, I got the best results. All the answers were very insightful, and most provided a useful visualization. I was able to get useful insights such as the ratings distribution, the number of courses by organization, and the correlation between ratings and review count.
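The review count extraction and the correlation step could look roughly like the sketch below. It assumes the reviews column holds strings such as `(1.2k reviews)`; that format, the column names, and the helper `parse_review_count` are all hypothetical, and the sample values are made up.

```python
import re

import pandas as pd


def parse_review_count(text) -> float:
    """Turn a string like '(1.2k reviews)' into a number.

    The input format is an assumption; unparseable values become NaN.
    """
    match = re.search(r"(\d+(?:,\d{3})*(?:\.\d+)?)([kKmM]?)", str(text))
    if not match:
        return float("nan")
    number = float(match.group(1).replace(",", ""))
    scale = {"": 1, "k": 1_000, "m": 1_000_000}[match.group(2).lower()]
    return number * scale


# Made-up sample rows, not values from the real dataset.
df = pd.DataFrame({
    "rating": [4.8, 4.6, 4.9, 4.3],
    "reviews": ["(1.2k reviews)", "(350 reviews)", "(2.5k reviews)", "(80 reviews)"],
})

df["review_count"] = df["reviews"].map(parse_review_count)

# Pearson correlation between rating and number of reviews.
correlation = df["rating"].corr(df["review_count"])
print(df[["rating", "review_count"]])
print("Correlation between rating and review count:", correlation)
```

Pearson correlation is the pandas default; whether the GPT used the same method in the chat is not something I can confirm.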
I then decided to start asking my own questions, based on information I wanted to know and not necessarily on what the GPT had suggested. This is where everything fell apart. I think I asked it to answer too many questions at once by listing them all in a single prompt. As soon as I did that, the visualizations became less meaningful and I had trouble interpreting them. In addition, when I asked it to correct them, the GPT started providing me with generated images instead of graphs based on the data. I was, however, able to get the information I wanted: which courses can be completed the fastest while having the highest rating, and which two or three skills are found together the most.
Fastest courses to complete: Network fundamentals and advanced learning algorithms.
Two skills found together: Computer programming/Python and Leadership/Management.
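A question like "which skills are found together the most" can also be checked by hand with a small co-occurrence count. The sketch below assumes each course's skills are stored as one comma-separated string; the sample rows are invented to mirror the result above and are not real records from the dataset.

```python
from collections import Counter
from itertools import combinations

# Invented sample of a comma-separated skills column (an assumption).
skills_column = [
    "Python, Computer Programming, Data Analysis",
    "Leadership, Management",
    "Computer Programming, Python",
    "Management, Leadership, Communication",
]


def count_skill_pairs(rows):
    """Count how often each pair of skills appears in the same course."""
    pairs = Counter()
    for row in rows:
        # Sort the unique skills so that (A, B) and (B, A) count as one pair.
        skills = sorted({s.strip() for s in row.split(",") if s.strip()})
        pairs.update(combinations(skills, 2))
    return pairs


pair_counts = count_skill_pairs(skills_column)
top_pairs = pair_counts.most_common(2)
for (first, second), count in top_pairs:
    print(f"{first} / {second}: {count} courses")
```

Changing `combinations(skills, 2)` to `combinations(skills, 3)` would count groups of three skills instead of pairs.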
Experience
Overall, my experience was great! The GPT worked better than anything I had experienced up to that point. By keeping the prompts simple in the beginning, I was able to guide the GPT to help me accomplish my goal. Once I started asking too many things at once, I was not able to recover the conversation. In fact, it was not able to provide me with any additional chart that I could use. Instead, it would just create AI-generated images of charts. Since the conversation cannot be reset to an earlier point, I suggest possibly using PandasAI for more complicated exploration. In Colab, if the line of code you run doesn't work, you can always simply delete it and try again. I am going to keep trying to use the GPT to do more data exploration. I think I need to have a step-by-step plan and feed it the information slowly to avoid confusion.