Feedback Discussion

Project Plans

  • Make sure that the external dataset is intentional and directly contributes to the project’s goals.

We updated the project plans to be more detailed, and the technical proposal now mentions how the external dataset is used. On the NLP page, we also added data visualizations to help readers understand the external dataset in depth.

EDA Work

  • The table does not meet the requirements and needs to be formatted

We first tried displaying tables with the plotly package. The visual result was good and the tables were interactive, but we found it hard to add table captions because the program treats a plotly table as a figure. We therefore switched to the tabulate and IPython.display packages for better formatting.
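
One possible way to attach a caption with tabulate and IPython.display is to render the table as HTML and insert a caption element into it. The sketch below is only an illustration under that assumption, not necessarily the exact code on our pages; the helper names add_caption and show_table are hypothetical.

```python
from html import escape


def add_caption(table_html: str, caption: str) -> str:
    """Insert a <caption> element right after the first opening <table> tag."""
    opening = "<table>"
    if opening not in table_html:
        raise ValueError("expected HTML containing a plain <table> tag")
    captioned = f"{opening}\n<caption>{escape(caption)}</caption>"
    return table_html.replace(opening, captioned, 1)


def show_table(rows, headers, caption):
    """Display rows as a captioned HTML table in a notebook or qmd code cell."""
    # Third-party imports stay local so add_caption works without them.
    from tabulate import tabulate
    from IPython.display import HTML, display

    table_html = str(tabulate(rows, headers=headers, tablefmt="html"))
    display(HTML(add_caption(table_html, caption)))
```

In a notebook cell, a call such as show_table(rows, ["Column A", "Column B"], "Table 1: Example caption") would then render the table with its caption; the column names and caption text here are placeholders.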

  • Captions needed

Captions were added to the tables and figures in all sections. We also revised the explanations and added cross-references so that readers can follow the discussion more easily.

  • Provide interactive plots to convey more information

To make the visualizations interactive, we switched from the matplotlib package to the Plotly package. The data were saved as CSV files so that they could be loaded and rendered directly in a qmd file.
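
The CSV-to-Plotly step can be sketched roughly as follows. This is a minimal illustration that assumes a CSV with one label column and one numeric column; the function names and the column names used in the usage note are hypothetical.

```python
import csv
import io


def read_counts(csv_text: str, label_col: str, value_col: str):
    """Parse CSV text into parallel lists of labels and numeric values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    labels, values = [], []
    for row in reader:
        labels.append(row[label_col])
        values.append(float(row[value_col]))
    return labels, values


def bar_figure(labels, values, title):
    """Build an interactive Plotly bar chart from the parsed columns."""
    import plotly.graph_objects as go  # third-party; imported only when plotting

    fig = go.Figure(go.Bar(x=labels, y=values))
    fig.update_layout(title=title)
    return fig
```

Inside a qmd code cell, one would read the saved CSV file as text, pass it to read_counts with the appropriate column names (for example "category" and "count"), and call bar_figure(labels, values, "Some title").show() to render the interactive plot.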

  • Repetitive use of bar charts; colors should be explained

Repeated bar charts were removed, and a new plot type, the bar-polar plot, was introduced to reduce duplication. We also set the colors of the remaining bar charts to qualitative color scales, which are more suitable for the bar charts we created. The bar chart for word counts was moved to the NLP page and converted to a table that displays word frequencies.
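
As a rough illustration of a bar-polar plot with a qualitative palette, the sketch below assigns one palette color per category and passes the result to Plotly. The helper names are hypothetical, and the choice of the Set2 palette is an assumption made for this example rather than the palette used on our pages.

```python
from itertools import cycle


def assign_colors(categories, palette):
    """Map each distinct category to a palette color, cycling if the palette is short."""
    colors = cycle(palette)
    mapping = {}
    for category in categories:
        if category not in mapping:
            mapping[category] = next(colors)
    return mapping


def bar_polar_figure(categories, values):
    """Build a bar-polar plot with one qualitative color per category."""
    # Third-party imports stay local so assign_colors works without Plotly.
    import plotly.graph_objects as go
    from plotly.colors import qualitative

    color_of = assign_colors(categories, qualitative.Set2)  # assumed palette
    return go.Figure(
        go.Barpolar(
            theta=categories,
            r=values,
            marker_color=[color_of[c] for c in categories],
        )
    )
```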

NLP Work

No feedback was received on this part. We applied the fixes for the problems found in the EDA part: captions were added, plots were made interactive, and outcomes were highlighted. Besides bar plots, two grouped bar-line plots were introduced for variety.

ML Work

No feedback was received on this part. We made the same improvements as before. To reduce the repetition of bar plots, we introduced confusion matrices for the different models, which we hope will also help readers understand the models.
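
For readers unfamiliar with confusion matrices, the following pure-Python sketch shows how one is computed from true and predicted labels. It is independent of whichever ML library produced our models, and the label values in the usage note are made up for illustration.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Return a matrix whose entry [i][j] counts samples with true class
    labels[i] that were predicted as labels[j]."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]
```

For example, with hypothetical labels "pos" and "neg", confusion_matrix(["pos", "pos", "neg"], ["pos", "neg", "neg"], ["pos", "neg"]) places correct predictions on the diagonal and misclassifications off the diagonal.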

Website/results

  • Insights need to be highlighted

We highlighted the most meaningful outcomes under each analytical problem that we completed. Executive summaries were added at the beginning of each page to help readers understand the insights.

  • Make sure that notebooks are in html format

The links to the notebooks were changed from ipynb to html. We deployed an additional GitHub website to host all the notebooks we needed as HTML files. After checking, all the links work and are accessible to all readers.
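
The link change can be automated with a small text substitution. The sketch below is a hypothetical helper, not the procedure we actually followed; it only rewrites the .ipynb suffix and assumes the HTML exports keep the same file names.

```python
import re

# Match ".ipynb" only when it ends a link target (closing bracket or quote,
# fragment, query string, whitespace, or end of text).
_IPYNB_SUFFIX = re.compile(r"\.ipynb(?=[)\"'#?\s]|$)")


def to_html_links(text: str) -> str:
    """Rewrite notebook links so they point at .html exports instead of .ipynb files."""
    return _IPYNB_SUFFIX.sub(".html", text)
```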