Sentiment Analysis Inference with a Pre-trained Model

This challenge focuses on implementing machine learning inference in JavaScript using a pre-trained sentiment analysis model. Sentiment analysis is a crucial task in natural language processing, allowing us to determine the emotional tone (positive, negative, or neutral) of a given text. You'll be provided with a simplified model representation and tasked with writing a function to perform inference on new text inputs.

Problem Description

You are given a simplified sentiment analysis model: a vocabulary object that maps each known word to a sentiment score. The model predicts the sentiment of a text by averaging the scores of the words in the text, with every word weighted equally. Your task is to implement a function predictSentiment that takes a text string as input and returns a sentiment prediction ("positive", "negative", or "neutral") based on the provided model.

Key Requirements:

  • Tokenization: Split the input text into individual words (tokens).
  • Sentiment Scoring: Look up the sentiment score for each word in the vocabulary. If a word is not found in the vocabulary, assign it a score of 0.
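  • Average Score: Calculate the average of the sentiment scores of all words in the text (the sum of the scores divided by the number of words). Every word carries equal weight, including words that score 0.
  • Sentiment Prediction: Based on the average score, predict the sentiment as follows: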
    • If the average score is greater than 0.2, predict "positive".
    • If the average score is less than -0.2, predict "negative".
    • Otherwise, predict "neutral".
  • Case Insensitivity: The text should be converted to lowercase before processing.
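
The tokenization and case-insensitivity requirements can be sketched as a small helper. The name `tokenize` is hypothetical, and stripping every character other than a-z (which also drops digits and apostrophes) is an assumption rather than something the challenge statement prescribes.

```javascript
// Hypothetical helper: lowercase the text, split on runs of whitespace,
// and strip every character that is not a letter a-z.
function tokenize(text) {
  return text
    .toLowerCase()
    .split(/\s+/)
    .map((token) => token.replace(/[^a-z]/g, ""))
    .filter((token) => token.length > 0); // drop tokens that were only punctuation
}

console.log(tokenize("This is a GREAT movie!!"));
// [ 'this', 'is', 'a', 'great', 'movie' ]
```

Filtering after the cleanup step means that a token such as "..." does not count as a word, so it cannot dilute the average.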

Expected Behavior:

The predictSentiment function should accurately predict the sentiment of a given text based on the provided model. It should handle cases where words are not in the vocabulary gracefully (by assigning a score of 0). The function should be case-insensitive.

Edge Cases to Consider:

  • Empty input text.
  • Text containing punctuation and special characters (ignore these).
  • Text containing words not present in the vocabulary.
  • Text with a mix of positive and negative words.
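
One way to handle words that are missing from the vocabulary is an explicit own-property check. The sketch below uses a hypothetical `lookupScore` helper and a two-word sample model, and it illustrates the lookup step only. The shorthand `model[word] || 0` can misbehave for a word such as "constructor", which every plain object inherits as a non-numeric property.

```javascript
// Illustrative sample model, not the full vocabulary from the challenge.
const sampleModel = { great: 0.8, sad: -0.6 };

// Hypothetical helper: return the word's score, or 0 for unknown words.
// Only the object's own keys count, so inherited properties are ignored.
function lookupScore(model, word) {
  return Object.prototype.hasOwnProperty.call(model, word) ? model[word] : 0;
}

console.log(lookupScore(sampleModel, "great"));        // 0.8
console.log(lookupScore(sampleModel, "movie"));        // 0
console.log(lookupScore(sampleModel, "constructor"));  // 0
console.log(typeof (sampleModel["constructor"] || 0)); // function
```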

Examples

Example 1:

Input: "This is a great movie!"
Output: "neutral"
Explanation: Only "great" (0.8) is in the vocabulary; the other four words score 0. The average is 0.8 / 5 = 0.16, which does not exceed the 0.2 threshold, so the prediction is "neutral".

Example 2:

Input: "I am feeling very sad today."
Output: "neutral"
Explanation: Only "sad" (-0.6) is in the vocabulary; the other five words score 0. The average is -0.6 / 6 = -0.1, which is not below the -0.2 threshold, so the prediction is "neutral".

Example 3:

Input: "The weather is okay."
Output: "neutral"
Explanation: The text contains "okay", which has a score close to zero. The overall average score will be close to zero, leading to a "neutral" prediction.

Example 4:

Input: ""
Output: "neutral"
Explanation: Empty input should result in a neutral prediction.
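
Because unknown words score 0 but still count toward the word total, a single sentiment word is diluted as the sentence grows. The following sketch computes the average for a few inputs. The names `averageScore` and `scores` are hypothetical and used only for illustration; the three scores are copied from the provided vocabulary.

```javascript
// Three entries copied from the challenge's vocabulary, for illustration.
const scores = { great: 0.8, sad: -0.6, okay: 0.1 };

// Hypothetical helper: mean score over all words; unknown words count as 0.
function averageScore(text) {
  const words = text
    .toLowerCase()
    .split(/\s+/)
    .map((w) => w.replace(/[^a-z]/g, ""))
    .filter((w) => w !== "");
  if (words.length === 0) return 0; // empty input: nothing to average
  const total = words.reduce(
    (sum, w) =>
      sum + (Object.prototype.hasOwnProperty.call(scores, w) ? scores[w] : 0),
    0
  );
  return total / words.length;
}

console.log(averageScore("Great movie!").toFixed(2));           // 0.40, above 0.2
console.log(averageScore("This is a great movie!").toFixed(2)); // 0.16, below 0.2
console.log(averageScore("Great, but sad.").toFixed(2));        // 0.07, mixed words cancel
console.log(averageScore("").toFixed(2));                       // 0.00
```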

Constraints

  • The input text will be a string.
  • The vocabulary and sentiment scores will be provided as constants.
  • The length of the input text can vary.
  • The function must return one of the following strings: "positive", "negative", or "neutral".
  • Performance is not a critical concern for this challenge. Focus on correctness and readability.

Notes

  • You are provided with a simplified model. In a real-world scenario, you would likely use a more sophisticated model and a more robust tokenization process.
  • Consider using regular expressions to remove punctuation and special characters from the input text.
  • The provided vocabulary and sentiment scores are designed to be simple and illustrative.
  • Remember to convert the input text to lowercase before processing.
  • The sentiment scores are relative; the absolute values are less important than their signs and magnitudes relative to each other.
  • The threshold values (0.2 and -0.2) for sentiment prediction can be adjusted as needed.
const vocabulary = {
  "great": 0.8,
  "good": 0.7,
  "amazing": 0.9,
  "excellent": 0.85,
  "bad": -0.7,
  "sad": -0.6,
  "terrible": -0.8,
  "awful": -0.75,
  "okay": 0.1,
  "neutral": 0.0,
  "happy": 0.75,
  "angry": -0.65
};

function predictSentiment(text) {
  // Empty or non-string input carries no sentiment.
  if (typeof text !== "string" || text === "") {
    return "neutral";
  }

  // Lowercase for case-insensitive matching, then split on runs of whitespace.
  const words = text.toLowerCase().split(/\s+/);

  let totalScore = 0;
  let wordCount = 0;

  for (const word of words) {
    // Strip punctuation, digits, and other non-letter characters.
    const cleanWord = word.replace(/[^a-z]/g, "");
    if (cleanWord === "") {
      continue; // Nothing left after cleaning (empty token or pure punctuation)
    }

    // Own-property check: inherited keys such as "constructor" must score 0.
    const score = Object.prototype.hasOwnProperty.call(vocabulary, cleanWord)
      ? vocabulary[cleanWord]
      : 0;
    totalScore += score;
    wordCount++;
  }

  // Text with no letters at all (for example "!!!") is neutral.
  if (wordCount === 0) {
    return "neutral";
  }

  const averageScore = totalScore / wordCount;

  if (averageScore > 0.2) {
    return "positive";
  }
  if (averageScore < -0.2) {
    return "negative";
  }
  return "neutral";
}
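
The Notes mention that the thresholds can be adjusted. A minimal sketch of one way to make them configurable follows; the helper name `classifyScore` and the option names `positiveAbove` and `negativeBelow` are hypothetical and not part of the challenge.

```javascript
// Hypothetical helper: map an average score to a label.
// The default thresholds match the challenge; callers may override them.
function classifyScore(
  averageScore,
  { positiveAbove = 0.2, negativeBelow = -0.2 } = {}
) {
  if (averageScore > positiveAbove) return "positive";
  if (averageScore < negativeBelow) return "negative";
  return "neutral";
}

console.log(classifyScore(0.16));                         // neutral
console.log(classifyScore(0.16, { positiveAbove: 0.1 })); // positive
console.log(classifyScore(-0.3));                         // negative
console.log(classifyScore(0.2));                          // neutral (the comparison is strict)
```

Keeping the thresholds in one place makes it easy to see how sensitive a prediction is to them without touching the scoring logic.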