Jessica Ayers 2023-07-08

Online News Popularity Data Blog Post

What would you do differently?

For this project, I would build the automation piece by piece instead of doing the entire analysis all at once. Most of the coding roadblocks I ran into on this project came from using parameters to create unique files. Attempted step by step, these may have been easier to overcome. The random forest model also originally took me hours to run. I am not sure if this problem was specific to my setup, but next time I would wait to run it until everything else is complete.

What was the most difficult part for you?

The most difficult part of this project for me was automating the files. I was able to create the six files, but after connecting to GitHub the files initially showed up blank because of their size. As a new GitHub user, I had not seen this problem before. I solved it by using the HTML preview link. In addition, making sure each file corresponded to a different channel category was quite a challenge. The order of the for loop was very important in this case.
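One way to keep each file matched to its channel is to build the output file name and the parameter from the same loop element. The sketch below is a minimal illustration, not the code used in this project: the channel labels, the file name `project.Rmd`, and the parameter name `channel` are all assumptions, and the R Markdown file would need a matching `params:` entry in its YAML header.

```r
# Hypothetical names: the channel labels, "project.Rmd", and the `channel`
# parameter are assumptions for illustration, not taken from the project.
channels <- c("lifestyle", "entertainment", "bus", "socmed", "tech", "world")

# Build each output file name from the same element that supplies the
# parameter, so a file cannot be paired with the wrong channel.
render_plan <- lapply(channels, function(ch) {
  list(output_file = paste0(ch, ".html"), params = list(channel = ch))
})

input_file <- "project.Rmd"

# Render only when the source file and the rmarkdown package are available.
if (file.exists(input_file) && requireNamespace("rmarkdown", quietly = TRUE)) {
  for (job in render_plan) {
    rmarkdown::render(
      input       = input_file,
      output_file = job$output_file,
      params      = job$params
    )
  }
}
```

Building the plan first also makes it possible to inspect `render_plan` and confirm the pairing before any file is rendered, which fits the piece-by-piece approach.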

What are your big take-aways from this project?

The big take-aways I have from this project are:

  • Automating R Markdown using parameters does save time, but only with practice.
  • Sometimes creating a new variable makes your work more adaptable and efficient.
  • Different models may perform better with different groups of data.
  • Centering and scaling must be applied to both the training and testing data.
  • There is more than one possible criterion for picking the best-fit model.
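The centering and scaling point can be sketched in base R. This is a minimal example on made-up data, not the preprocessing used in this project. It assumes the common practice of estimating the means and standard deviations on the training set and then applying those same values to the testing set.

```r
set.seed(1)

# Made-up data standing in for a matrix of numeric predictors.
x <- matrix(rnorm(200, mean = 10, sd = 3), ncol = 2,
            dimnames = list(NULL, c("x1", "x2")))

# Hypothetical 70/30 split into training and testing rows.
train_idx <- sample(nrow(x), size = 70)
train <- x[train_idx, ]
test  <- x[-train_idx, ]

# Estimate the centering and scaling values from the training set only.
train_means <- colMeans(train)
train_sds   <- apply(train, 2, sd)

# Apply the same training-set values to both sets.
train_scaled <- scale(train, center = train_means, scale = train_sds)
test_scaled  <- scale(test,  center = train_means, scale = train_sds)
```

After this step the training columns have mean 0 and standard deviation 1. The testing columns will be close to that but not exact, since they are transformed with the training values rather than their own.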

GitHub Pages Repo

Regular Repo

