Sentiment-Analysis

A Voice of the Customer (VoC) solution that enhances the customer experience with a serverless architecture, using Amazon Kinesis, Amazon Athena, Amazon QuickSight, Amazon Comprehend, and ChatGPT-style large language models (LLMs) for sentiment analysis.


AI-Driven Social Media Dashboard

AWS Implementation Guide

About This Guide

This implementation guide discusses architectural considerations and configuration steps for deploying an AI-Driven Social Media Dashboard on the Amazon Web Services (AWS) Cloud. It includes links to an AWS CloudFormation template that launches, configures, and runs the AWS services required to deploy this solution using AWS best practices for security and availability.

The guide is intended for IT infrastructure architects, administrators, and DevOps professionals who have practical experience architecting on the AWS Cloud.

Overview

Companies can gain valuable insight and deepen brand awareness by analyzing their social media interactions with customers. Using machine learning (ML) and business intelligence (BI) services from Amazon Web Services (AWS), including Amazon Translate, Amazon Comprehend, Amazon Kinesis, Amazon Athena, and Amazon QuickSight, businesses can build meaningful, low-cost social media dashboards to analyze customer sentiment. This analysis can lead to better lead-acquisition opportunities, increased website traffic, stronger customer relationships, and improved customer service.

To help customers more easily build a natural-language-processing (NLP)-powered social media dashboard for customer feedback, AWS offers the AI-Driven Social Media Dashboard. This solution automatically provisions and configures the AWS services necessary to capture multi-language tweets in near real-time, translate them, and display them on a dashboard powered by Amazon QuickSight.

You can also capture both the raw and enriched datasets and durably store them in the solution’s data lake. This allows data analysts to quickly and easily perform new types of analytics and ML on this data.

Cost

You are responsible for the cost of the AWS services used while running this reference deployment. As of the date of publication, the total cost for running this solution with default settings in the US East (N. Virginia) Region is approximately $190 per month for ingesting 10,000 tweets per day and storing them for one year. Refer to Appendix A for a breakdown of the cost.

Architecture Overview

Deploying this solution builds the following environment in the AWS Cloud.
Figure 1: AI-Driven Social Media Dashboard architecture on AWS

The AWS CloudFormation template deploys an Amazon Elastic Compute Cloud (Amazon EC2) instance in an Amazon Virtual Private Cloud (Amazon VPC) that ingests tweets from Twitter. An Amazon Kinesis Data Firehose delivery stream loads the streaming tweets into the raw prefix in the solution’s Amazon Simple Storage Service (Amazon S3) bucket. Amazon S3 invokes an AWS Lambda function to analyze the raw tweets, using Amazon Translate to translate non-English tweets into English and Amazon Comprehend to perform entity extraction and sentiment analysis with natural language processing (NLP).

A second Kinesis Data Firehose delivery stream loads the translated tweets and sentiment values into the sentiment prefix in the Amazon S3 bucket, and a third delivery stream loads the extracted entities into the entities prefix.
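
To illustrate this flow, the following is a minimal sketch of such an enrichment function in Python with boto3. It assumes a Python Lambda runtime and, for simplicity, one JSON tweet per S3 object; the delivery stream names are hypothetical placeholders, not the solution's actual resource names.

```python
# A minimal sketch of the enrichment function; assumes a Python Lambda
# runtime and one JSON tweet per S3 object. Delivery stream names are
# hypothetical placeholders.
import json
import boto3

s3 = boto3.client("s3")
translate = boto3.client("translate")
comprehend = boto3.client("comprehend")
firehose = boto3.client("firehose")

def handler(event, context):
    # Read the raw tweet object that triggered this invocation.
    info = event["Records"][0]["s3"]
    obj = s3.get_object(Bucket=info["bucket"]["name"], Key=info["object"]["key"])
    tweet = json.loads(obj["Body"].read())

    text = tweet["text"]
    lang = tweet.get("lang", "en")

    # Translate non-English tweets into English before analysis.
    if lang != "en":
        text = translate.translate_text(
            Text=text, SourceLanguageCode=lang, TargetLanguageCode="en"
        )["TranslatedText"]

    # Sentiment analysis and entity extraction with Amazon Comprehend.
    sentiment = comprehend.detect_sentiment(Text=text, LanguageCode="en")
    entities = comprehend.detect_entities(Text=text, LanguageCode="en")

    # Deliver the enriched records to the downstream Firehose streams,
    # which load the sentiment and entities prefixes in the S3 bucket.
    firehose.put_record(
        DeliveryStreamName="SentimentDeliveryStream",  # hypothetical name
        Record={"Data": (json.dumps({
            "id": tweet["id_str"], "text": text,
            "sentiment": sentiment["Sentiment"],
        }) + "\n").encode("utf-8")},
    )
    firehose.put_record(
        DeliveryStreamName="EntitiesDeliveryStream",  # hypothetical name
        Record={"Data": (json.dumps({
            "id": tweet["id_str"],
            "entities": entities["Entities"],
        }) + "\n").encode("utf-8")},
    )
```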

The solution also deploys a data lake that includes AWS Glue for data transformation, Amazon Athena for data analysis, and Amazon QuickSight for data visualization. The AWS Glue Data Catalog contains a logical database (ai_driven_social_media_dashboard) that organizes the tables for the data in Amazon S3. Athena uses these table definitions to query the data stored in Amazon S3 and return the information to an Amazon QuickSight dashboard.
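
As an illustration, queries against these table definitions can also be run programmatically with the Athena API. The database name below comes from this guide; the table name and results location are hypothetical placeholders.

```python
# A minimal sketch of querying the data lake with Amazon Athena via boto3.
# The table name and the results bucket are hypothetical placeholders.
import boto3

athena = boto3.client("athena")

query = """
SELECT sentiment, COUNT(*) AS tweet_count
FROM ai_driven_social_media_dashboard.tweet_sentiments
GROUP BY sentiment
"""

response = athena.start_query_execution(
    QueryString=query,
    QueryExecutionContext={"Database": "ai_driven_social_media_dashboard"},
    ResultConfiguration={"OutputLocation": "s3://your-results-bucket/athena/"},
)
print(response["QueryExecutionId"])  # poll this ID for results
```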

Solution Components

Tweet Ingestion

The solution’s Amazon Elastic Compute Cloud (Amazon EC2) instance runs a Node.js application that monitors tweets for a list of terms you specify during initial deployment. When the solution finds a tweet containing one or more of those terms, it ingests the tweet. You can modify the terms that are pulled from the Twitter streaming API. By default, this solution uses stream processing for tweets. After ingestion, AWS Lambda analyzes the tweets using Amazon Translate and Amazon Comprehend.
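
The following Python sketch illustrates the term-matching logic only; the solution's actual ingestion app is a Node.js application that consumes the live Twitter stream. Here, SAMPLE_TWEETS stands in for the stream, and the terms and delivery stream name are hypothetical placeholders.

```python
# An illustration of term matching and forwarding to Firehose; the
# sample tweets stand in for the Twitter streaming API, and the names
# below are hypothetical placeholders.
import json
import boto3

firehose = boto3.client("firehose")

TRACKED_TERMS = ["aws", "kinesis"]  # hypothetical terms set at deployment
SAMPLE_TWEETS = [                   # stand-in for the live Twitter stream
    {"id_str": "1", "text": "Trying out AWS Kinesis today", "lang": "en"},
    {"id_str": "2", "text": "Nothing to see here", "lang": "en"},
]

def matches(text, terms):
    """Return True if the tweet text contains at least one tracked term."""
    lowered = text.lower()
    return any(term in lowered for term in terms)

for tweet in SAMPLE_TWEETS:
    if matches(tweet["text"], TRACKED_TERMS):
        # Forward matching tweets to the raw-data Firehose delivery stream.
        firehose.put_record(
            DeliveryStreamName="RawTweetsDeliveryStream",  # hypothetical name
            Record={"Data": (json.dumps(tweet) + "\n").encode("utf-8")},
        )
```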

To handle tens or hundreds of tweets per second, you can perform batch API calls or use AWS Glue with triggers for batch processing.
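
For example, Amazon Comprehend's batch APIs accept up to 25 documents per call, which reduces per-request overhead at higher tweet volumes. A minimal sketch:

```python
# A minimal sketch of batch sentiment analysis with Amazon Comprehend;
# batch_detect_sentiment accepts up to 25 documents per call.
import boto3

comprehend = boto3.client("comprehend")

tweets = ["I love this product!", "The service was slow."]  # example batch
response = comprehend.batch_detect_sentiment(TextList=tweets, LanguageCode="en")
for result in response["ResultList"]:
    print(result["Index"], result["Sentiment"])
```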

Social Media Data Lake

This solution includes a data lake to store your tweet data. The data lake consists of Amazon S3 to store the raw and enriched datasets, Amazon Kinesis Data Firehose delivery streams to write the ingested tweet data to the data lake, and the AWS Glue Data Catalog as the metadata catalog for analytics. By default, this solution uses Amazon Athena to query data in the data lake, but you can extend it to use Amazon Redshift Spectrum, Amazon EMR, or Amazon SageMaker.
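
For example, you can inspect the tables the solution registers in the Glue Data Catalog with boto3. The database name comes from this guide; the fields read below are standard Glue table metadata.

```python
# A minimal sketch of listing the solution's Glue Data Catalog tables.
import boto3

glue = boto3.client("glue")
tables = glue.get_tables(DatabaseName="ai_driven_social_media_dashboard")
for table in tables["TableList"]:
    # Print each table's name and its backing S3 location.
    location = table.get("StorageDescriptor", {}).get("Location")
    print(table["Name"], location)
```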

Design Considerations

Supported Languages

By default, this solution can ingest tweets in English, Spanish, German, French, Arabic, and Portuguese. To support additional languages, add their language codes to the list in the Twitter Language AWS CloudFormation template parameter.

Note that this solution does not automatically map the language codes in Twitter to the codes in Amazon Translate. The solution includes a default set of language codes that match in Twitter and Amazon Translate. Customers who want to add additional languages should either verify that Twitter and Amazon Translate use the same language code, or modify the included Lambda function to map the languages. For a list of codes for Amazon Translate, see Supported Language Codes in the Amazon Translate Developer Guide.
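
If you take the second route, one way to map the languages is a small lookup table inside the Lambda function. The sketch below is hypothetical, not the solution's included function; the example codes shown are illustrative, so verify them against both services' documentation.

```python
# A hypothetical sketch of mapping Twitter language codes to Amazon
# Translate codes for languages whose codes differ between the services.
TWITTER_TO_TRANSLATE = {
    "zh-cn": "zh",     # illustrative example: Simplified Chinese
    "zh-tw": "zh-TW",  # illustrative example: Traditional Chinese
}

def to_translate_code(twitter_code):
    """Return the Amazon Translate code for a Twitter language code,
    falling back to the Twitter code when no mapping is needed."""
    return TWITTER_TO_TRANSLATE.get(twitter_code, twitter_code)
```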

Data Visualization

You can use Amazon QuickSight to build dashboards that enable you to visualize tweets over time, the sentiment of the tweets, and the relationship between the entities being discussed and the sentiment values from the tweets. For more information on how to leverage Amazon QuickSight to visualize tweet data, see Step 4 of the Automated Deployment section.

Stack Deletion

The AI-Driven Social Media Dashboard is designed so that you retain the tweet data stored in Amazon S3. If you delete the solution stack, the Amazon S3 bucket containing your tweet data is not deleted; you must delete this bucket manually.
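
If you no longer need the data, a minimal sketch of emptying and deleting the bucket with boto3 follows; the bucket name is a hypothetical placeholder, and the deletion is irreversible.

```python
# A minimal sketch of emptying and deleting the tweet data bucket.
# WARNING: this permanently deletes the data; the name is a placeholder.
import boto3

s3 = boto3.resource("s3")
bucket = s3.Bucket("your-tweet-data-bucket")  # hypothetical bucket name

bucket.objects.all().delete()  # remove every object first
bucket.delete()                # then delete the now-empty bucket
```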

Regional Deployment

This solution uses Amazon QuickSight, Amazon Athena, Amazon Translate, Amazon Comprehend, and AWS Glue, which are currently available only in specific AWS Regions. Therefore, you must launch this solution in an AWS Region where these services are available. For the most current service availability by AWS Region, see AWS service offerings by Region.

View template

This is the primary solution template you use to launch the AI-Driven Social Media Dashboard and all associated components. The default configuration deploys an Amazon Elastic Compute Cloud (Amazon EC2) instance, Amazon Kinesis Data Firehose delivery streams, an AWS Lambda function, Amazon Athena, the AWS Glue Data Catalog, Amazon Translate, and Amazon Comprehend, but you can also customize the template based on your specific needs.

Automated Deployment

Before you launch the automated deployment, please review the architecture, configuration, and other considerations discussed in this guide. Follow the step-by-step instructions in this section to configure and deploy the AI-Driven Social Media Dashboard template into your account.

Time to deploy: Approximately five minutes

Prerequisites

Before you launch this solution, you must have a Twitter consumer key (API key) and secret, and a Twitter access key and secret. If you do not already have these keys, you must create an app in Twitter.

You must also have an Amazon Elastic Compute Cloud (Amazon EC2) key pair. If you do not already have a key pair, see Creating a Key Pair Using Amazon EC2 in the Amazon EC2 User Guide for Linux Instances.
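
A key pair can also be created programmatically with boto3, as a minimal sketch; the key pair name is a hypothetical placeholder.

```python
# A minimal sketch of creating an Amazon EC2 key pair with boto3;
# the key pair name is a hypothetical placeholder.
import boto3

ec2 = boto3.client("ec2")
key_pair = ec2.create_key_pair(KeyName="social-media-dashboard-key")

# Save the private key material locally; AWS does not store it for you.
with open("social-media-dashboard-key.pem", "w") as f:
    f.write(key_pair["KeyMaterial"])
```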

What We’ll Cover

The procedure for deploying this architecture on AWS consists of the following steps. For detailed instructions, follow the links for each step.

Step 1. Launch the Stack (see the example sketch after this list)

Step 2. Build the Queries

Step 3. Create the Data Source

Step 4. Build the Dashboard
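
As an illustration of Step 1, a stack can also be launched programmatically instead of through the console. The template URL and parameter key below are hypothetical placeholders; use the values from the solution's deployment page.

```python
# A hypothetical sketch of launching the solution stack with boto3.
# The template URL and parameter key are placeholders, not the
# solution's published values.
import boto3

cfn = boto3.client("cloudformation")
cfn.create_stack(
    StackName="ai-driven-social-media-dashboard",
    TemplateURL="https://example.com/solution.template",  # placeholder URL
    Parameters=[
        # Hypothetical parameter key for the tracked Twitter terms.
        {"ParameterKey": "TwitterTermList", "ParameterValue": "AWS,EC2"},
    ],
    Capabilities=["CAPABILITY_IAM"],  # the template creates IAM resources
)
```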