Sentiment Analysis for Product Rating


Sentiment is a judgment or opinion based on feeling. Sentiment plays a major role when products from different brands are developed with the same quality: it helps one brand’s product gain a better market than the other. Sentiment analysis is opinion mining that deals with sentiment polarity categorization. The processes involved in sentiment analysis are explained below:


Data collection

Online portals like Twitter provide an API to extract data, but most other portals do not offer such a mechanism. Scripting languages are used to extract content from these portals. In general, the data in these cases consists of rating numbers (1 to 5 or 1 to 10) and rating descriptions. Extracting, analyzing and charting rating numbers is considerably easier than analyzing rating descriptions.
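As a concrete sketch, rating numbers and descriptions might be pulled from a review page with regular expressions. The HTML snippet and its markup below are hypothetical; real portals structure their pages differently.

```python
import re

# Hypothetical review-page HTML; real portals use different markup.
html = """
<div class="review"><span class="rating">4</span><p>Great battery life</p></div>
<div class="review"><span class="rating">2</span><p>Screen scratches easily</p></div>
"""

# Pull out the numeric ratings (1-5) and their free-text descriptions.
ratings = [int(m) for m in re.findall(r'class="rating">(\d)</span>', html)]
descriptions = re.findall(r"<p>([^<]+)</p>", html)

print(ratings)                      # [4, 2]
print(sum(ratings) / len(ratings))  # average rating: 3.0
```

Charting the extracted numbers is then straightforward, whereas the descriptions feed the feature-extraction stage.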


Feature extraction

A bag of words consists of a set of standard words, and these words are compared against the review data to derive a binary feature vector. However, this method is not effective on phrases, so collocation is handled with bigram features. Bigrams help in identifying negation, since negation words occur as a pair or group of words. During feature extraction, spell checking needs to be done to clean up the data. Part-of-speech tag identification is a key part of feature extraction.
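A minimal sketch of binary feature vectors over unigrams and bigrams, so that negation pairs such as “not good” get their own feature. The vocabulary here is invented; a real system derives it from training data.

```python
# Hypothetical vocabulary of unigrams and negation bigrams.
VOCAB = ["good", "bad", "not good", "not bad", "battery", "screen"]

def features(review):
    tokens = review.lower().split()
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    present = set(tokens) | set(bigrams)
    # Binary feature vector: 1 if the vocabulary term appears, else 0.
    return [1 if term in present else 0 for term in VOCAB]

print(features("The battery is not good"))  # [1, 0, 1, 0, 1, 0]
```

Note that the bare unigram "good" also fires here; a fuller implementation would suppress unigrams that fall inside a matched negation bigram.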



Classification

For classification of the data there are various methods, such as the Naive Bayes method, which uses Bayes’ rule to calculate the probability that a feature vector belongs to a class. This method is somewhat complicated, and it is hard to trace back which probabilities are causing particular classifications. The decision list is another method; it operates as a rule-based tagger and has the advantage of human readability.
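To make Bayes’ rule concrete, the sketch below trains a tiny multinomial Naive Bayes classifier with Laplace smoothing. The four training reviews and their labels are invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy labelled training reviews (hypothetical).
train = [
    ("good great excellent", "pos"),
    ("great battery good screen", "pos"),
    ("bad poor terrible", "neg"),
    ("poor battery bad screen", "neg"),
]

# Count word occurrences per class and class frequencies.
word_counts = defaultdict(Counter)
class_counts = Counter()
for text, label in train:
    class_counts[label] += 1
    word_counts[label].update(text.split())

vocab = {w for counts in word_counts.values() for w in counts}

def classify(text):
    # Bayes' rule in log space, with Laplace (add-one) smoothing.
    scores = {}
    for label in class_counts:
        log_prob = math.log(class_counts[label] / sum(class_counts.values()))
        total = sum(word_counts[label].values())
        for w in text.split():
            log_prob += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        scores[label] = log_prob
    return max(scores, key=scores.get)

print(classify("good battery"))  # pos
```

Because every word contributes an independent log-probability term, explaining a decision means inspecting each per-word ratio, which is why the method is hard to trace back.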



Presentation of results

Based on the sentiment analysis, the results are charted or represented in tabular form. For simple rating-number analysis, the results are charted in a graph with products on the x-axis and review rating numbers on the y-axis. For the Naive Bayes and decision-list methods, the results are formatted as a table with features in one column and scores in the other columns.



Lexicon-based approach

In this method, for each review the matched negative and positive words from a predefined sentiment lexicon are counted. The polarity of the review is calculated by counting the polar words and assigning a polarity value such as positive, negative or neutral.
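A minimal sketch of the counting step, with made-up positive and negative word lists standing in for a real sentiment lexicon:

```python
# Illustrative word lists; a real lexicon has thousands of scored entries.
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "hate"}

def polarity(review):
    words = review.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

print(polarity("Great screen but terrible battery and poor build"))  # negative
```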



Challenges

The challenges in sentiment analysis start right at data collection. Most of the data is free text available on HTML pages. The rating numbers and rating descriptions often do not match, so simple analysis based on rating numbers alone is not accurate; this leads to analysis of the rating descriptions using various machine learning tools, which is complex in nature. Since ratings are open to all customers, junk reviews are likely and spelling mistakes are common in rating content.



Sentiment analysis tools

Natural Language Toolkit (NLTK) – It runs on the Python platform and provides features such as tokenizing, identifying named entities and parsing.

Stanford CoreNLP Suite – It provides tools for part-of-speech tagging, grammar parsing and named-entity recognition.

GATE and Apache UIMA – They help in building complex NLP workflows that integrate different processing steps.




SAS Text Analytics – It provides text analytics software to extract information from text content, discovering patterns and trends using natural language processing, advanced linguistic technologies and advanced statistical modeling.

IBM Text Analytics – It converts unstructured survey text into quantitative data and automates the categorization process to eliminate manual processing. To reduce the ambiguities of human language it uses linguistics-based technologies.

Lexalytics Text Analytics – Salience is a text analytics engine built by Lexalytics. It is helpful for social media monitoring, sentiment analysis and voice-of-the-customer surveys.

Smartlogic – It provides rule-based classification modelling and information visualization, applying metadata and classification to deliver navigation and search experiences.



Customers today leave trails of information and data across the Internet – bits of insight into who they are, what they like, and what they are going to buy. Like most businesses today, the automotive industry is gathering and using as much of this data as it can collect. Of course, not all data is created equal. Data may be incomplete, unstructured, or outright wrong. To make matters worse, this data is not necessarily easy to gather and transform into meaningful information.


If you don’t know who your customers and best prospects are, how can you send them messages to get them into your dealership or repair center? Sure, you may have data on when they last visited your place of business, along with a few contact details here and there. Perhaps you have their billing address if they have done business with you before. But, as is often the case, you are probably missing the many details that would help you understand your customers on a much more personalized level.

You might want to know which families have children and may be ready to move up to a larger vehicle, who has a teenage driver in the house, or who is interested in the outdoors. Details such as income, marital status, occupation, hobbies, lifestyle, and age are a few examples of demographics that can be used to create targeted marketing messages to which your shoppers are most able to relate.

Specialized Auto Data

A few specialized automotive data providers can supply detailed data on vehicles and their owners. Look for a data solution that includes:

  • 100% populated make, model and year, obtained directly from VINs.
  • A completely populated database in which each lead record includes data such as name, address, make, model and year.
  • Premium selects such as in-market for a new vehicle, buyer demographics, segmented wealth modeling, email addresses, and full VIN.
  • Options such as engine size, fuel type, drive train, engine block, and engine cylinders.
  • Validated mailing addresses and directory-assistance-verified telephone numbers for automotive marketing data.


With the Modi government in power, there are expectations of increased focus on reforms and a ramp-up in infrastructure. Thus, government spending on infrastructure such as roads and airports, and higher GDP growth in the future, will benefit the auto sector in general. We expect a slew of launches in both passenger cars and utility vehicles (UVs), given that competition has intensified.

Our prospect targeting improves marketing results by directing your sales messages and financing offers to ready, willing and able-to-buy consumers. It drives the next generation of automotive marketing by combining deeper insights into consumer attitudes and behaviors with practical guidance on the use of direct, traditional and digital media, real-time lead scoring and advertising targeting.

Key Features and Benefits

  • Cross media – direct mail, email, online advertising and more
  • Innovative – the next generation of patented methodologies
  • Affordable – value driven
  • Effective – validation results available.

Individualized service marketing campaigns are vital to retaining customers and capturing the most customer value for dealerships. This service marketing program builds sales and profits by targeting customers in equity, or those at the end of a term, lease or warranty. Flawless Prospect uses DMS data, OEM incentives, book values and third-party data to give vehicle dealers a competitive edge by identifying current and conquest opportunities that have the highest likelihood of purchasing or upgrading with a dealership. Customized alerts deliver ready-to-buy opportunities at the right time, to the right associate, in an RO dashboard that is simple and transparent. Turn current service customers into loyal, repeat sales customers by getting them into a new vehicle with a similar payment for little or no money down.


“Prospect data and planning the strategy make a clear path to becoming a successful automobile company.”




Applications of Data Mining

Service providers

The first example of Data Mining and Business Intelligence comes from service providers in the mobile phone and utilities industries. Mobile phone and utilities companies use Data Mining and Business Intelligence to predict ‘churn’, the term they use for when a customer leaves their company to get their phone/gas/broadband from another provider. They collate billing information, customer service interactions, website visits and other metrics to give each customer a probability score, then target offers and incentives at customers whom they perceive to be at a higher risk of churning.
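As an illustration only, a churn-risk score might combine such metrics linearly. The weights below are invented; a real provider would fit them to historical churn outcomes with a statistical model.

```python
# Toy churn-risk score over the metrics named above (weights are assumptions).
def churn_score(months_since_last_visit, complaints, monthly_bill_drop):
    score = 0.0
    score += 0.05 * months_since_last_visit  # disengagement signal
    score += 0.15 * complaints               # service friction
    score += 0.5 * monthly_bill_drop         # usage shrinking (0-1 fraction)
    return min(score, 1.0)                   # clamp to a probability-like range

at_risk = churn_score(months_since_last_visit=6, complaints=3, monthly_bill_drop=0.4)
print(round(at_risk, 2))  # 0.95
```

Customers above a chosen threshold would then be queued for retention offers.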


Another example of Data Mining and Business Intelligence comes from the retail sector. Retailers segment customers into ‘Recency, Frequency, Monetary’ (RFM) groups and target marketing and promotions at those different groups. A customer who spends little but often, and last did so recently, will be handled differently from a customer who spent big but only once, and some time ago. The former may receive loyalty, upsell and cross-sell offers, whereas the latter may be offered a win-back deal, for instance.
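The RFM split can be sketched as simple rules. The cut-offs below (90 days, 10 orders, one year) and the customer records are assumptions for illustration, not industry standards.

```python
from datetime import date

# Hypothetical customer records with recency, frequency and monetary fields.
customers = {
    "alice": {"last_purchase": date(2024, 5, 1), "orders": 12, "spend": 240.0},
    "bob":   {"last_purchase": date(2023, 1, 10), "orders": 1,  "spend": 900.0},
}

def rfm_segment(c, today=date(2024, 6, 1)):
    recency_days = (today - c["last_purchase"]).days
    if recency_days < 90 and c["orders"] >= 10:   # recent and frequent
        return "loyal: upsell / cross-sell offers"
    if recency_days > 365:                        # lapsed, whatever the spend
        return "lapsed: win-back deal"
    return "standard"

for name, c in customers.items():
    print(name, "->", rfm_segment(c))
```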


Perhaps some of the most well-known examples of Data Mining and Analytics come from e-commerce sites. Many e-commerce companies use Data Mining and Business Intelligence to offer cross-sells and up-sells through their websites. One of the most famous of these is, of course, Amazon, which uses sophisticated mining techniques to drive its “People who viewed that product also liked this” functionality.


Supermarkets provide another good example of Data Mining and Business Intelligence in action. Famously, supermarket loyalty card programmes are usually driven mostly, if not solely, by the desire to gather comprehensive data about customers for use in data mining. One notable recent example of this involved the US retailer Target. As part of its Data Mining programme, the company developed rules to predict whether its shoppers were likely to be pregnant. By looking at the contents of customers’ shopping baskets, it could spot customers it thought were likely to be expecting, and begin targeting promotions for nappies (diapers), cotton wool and so on. The prediction was so accurate that Target made the news by sending promotional coupons to expectant customers whose families did not yet realise they were pregnant.
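The basket analysis behind such predictions rests on support (how often items co-occur across all baskets) and confidence (how often the pair appears given that the first item does). The toy baskets below are invented; Target’s actual models are far more elaborate.

```python
from itertools import combinations
from collections import Counter

# Invented shopping baskets for illustration.
baskets = [
    {"cotton wool", "unscented lotion", "vitamins"},
    {"cotton wool", "unscented lotion"},
    {"beer", "crisps"},
    {"cotton wool", "unscented lotion", "nappies"},
]

pair_counts = Counter()
item_counts = Counter()
for b in baskets:
    item_counts.update(b)
    pair_counts.update(frozenset(p) for p in combinations(sorted(b), 2))

pair = frozenset({"cotton wool", "unscented lotion"})
support = pair_counts[pair] / len(baskets)          # co-occurrence rate overall
confidence = pair_counts[pair] / item_counts["cotton wool"]  # given the antecedent
print(support, confidence)  # 0.75 1.0
```

High-confidence pairs become the rules used to flag likely-expecting shoppers.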

Crime agencies

The use of Data Mining and Business Intelligence is not solely reserved for corporate applications, as our final example shows. Beyond corporate applications, crime prevention agencies use analytics and Data Mining to spot trends across vast quantities of data – helping with everything from where to deploy police manpower (where is crime most likely to happen, and when?), to whom to search at a border crossing (based on age and type of vehicle, number and age of occupants, and border crossing history), and even which intelligence to take seriously in counter-terrorism activities.

Text analytics

Text analytics is the process of analyzing unstructured text, extracting relevant information, and transforming it into useful business intelligence. Text analytics processes can be performed manually, but the amount of text-based data available to companies today makes it increasingly important to use intelligent, automated solutions.


Why is text analytics important?

Emails, online reviews, tweets, call center agent notes, and the vast array of other written feedback, all hold insight into customer wants and needs. But only if you can unlock it. Text analytics is the way to extract meaning from this unstructured text, and to uncover patterns and themes.


Several text analytics use cases exist:

  • Case management—for example, insurance claims assessment, healthcare patient records and crime-related interviews and reports
  • Competitor analysis
  • Fault management and field-service optimization
  • Legal e-discovery in litigation cases
  • Media coverage analysis
  • Pharmaceutical drug trial improvement
  • Sentiment analytics
  • Voice of the customer

A well-understood process for text analytics includes the following steps:

  1. Extracting raw text
  2. Tokenizing the text—that is, breaking it down into words and phrases
  3. Detecting term boundaries
  4. Detecting sentence boundaries
  5. Tagging parts of speech—words such as nouns and verbs
  6. Tagging named entities so that they are identified—for example, a person, a company, a place, a gene, a disease, a product and so on
  7. Parsing—for example, extracting facts and entities from the tagged text
  8. Extracting knowledge to understand concepts such as a personal injury within an accident claim
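A toy pass over a few of the steps above (tokenization, sentence-boundary detection and a crude named-entity guess) using only the Python standard library; production systems use trained models, such as those in NLTK or Stanford CoreNLP, for each stage. The sample text is invented.

```python
import re

raw = "Acme Corp denied the claim. John Smith filed an appeal."

# Step 4: sentence boundaries (split after terminal punctuation).
sentences = re.split(r"(?<=[.!?])\s+", raw.strip())
# Step 2: tokenization (alphabetic runs only, for simplicity).
tokens = [re.findall(r"[A-Za-z]+", s) for s in sentences]
# Step 6: a crude named-entity guess - adjacent capitalised words.
entities = re.findall(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b", raw)

print(sentences)
print(entities)  # ['Acme Corp', 'John Smith']
```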


Qualitative Technique

An array of techniques may be employed to derive meaning from text. The most accurate method is an intelligent, trained human being reading the text and interpreting its meaning. This is the slowest and most costly method, but also the most accurate and powerful. Ideally, the reader is trained in qualitative research techniques and understands the industry and contextual framework of the text. A well-trained qualitative researcher can extract extraordinary understanding and insight from text. In a typical project, the qualitative researcher might read hundreds of paragraphs to analyze the text, develop hypotheses, draw conclusions, and write a report. This type of analysis is subject to the risks of bias and misinterpretation on the part of the qualitative researcher, but these limitations are always with us, regardless of method. The power of the human mind cannot be equaled by any software or computer system. Decision Analyst’s team of highly trained qualitative researchers consists of experts at understanding text.

Content Analysis or Open-End Coding

The history of text analytics traces back to World War II and the development of “content analysis” by governmental intelligence services. That is, intelligence analysts would read documents, magazines, records, dispatches, etc., and assign numeric codes to different topics, concepts, or ideas. By summing up these numeric codes, the analyst could quantify the different concepts or ideas, and track them over time. This approach was further developed by the survey research industry after the war. Today as then, open-end questions in surveys are analyzed by someone reading the textual answers and assigning numeric codes. These codes are then summarized in tables, so that the analyst has a quantitative sense of what people are saying. This remains a powerful method of text mining or text analytics. It leverages the power of the human mind to discern subtleties and context.
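Once a human coder has assigned the codes, tabulation is mechanical. The code book and coded responses below are invented to show the shape of the output:

```python
from collections import Counter

# Hypothetical code book: numeric code -> concept, as in classic content analysis.
CODE_BOOK = {1: "price complaint", 2: "praise for staff", 3: "delivery delay"}

coded_responses = [  # (respondent_id, codes assigned by the human coder)
    (101, [1]),
    (102, [2, 3]),
    (103, [1, 3]),
    (104, [3]),
]

# Sum the codes across respondents to quantify each concept.
tally = Counter(code for _, codes in coded_responses for code in codes)
for code, count in tally.most_common():
    print(f"{CODE_BOOK[code]}: {count} of {len(coded_responses)} respondents")
```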


The first step is careful selection of a representative sample of respondents or responses. In surveys the sample is usually representative and comparatively small (fewer than 2,000 respondents), so all open-ended questions are coded. However, in the case of social media text, a CRM system, or a customer complaint system, the text might be made up of millions of customer comments, so the first step is the random selection of a few thousand records, which are checked for duplicates, geographic distribution, and so on. Then a human being reads each and every paragraph of text and assigns numeric codes to different meanings and ideas. These codes are tabulated, and statistical summaries are prepared for the analyst. This is text mining or text analytics at its apogee: open-end coding offers the strength of numbers (statistical significance) and the intelligence of the human mind. Decision Analyst operates a large multilanguage coding facility with highly trained staff specifically for content analysis and text analytics.


Machine Text Mining or Text Analytics

With the explosion of keyboard-generated text related to the spread of PCs and the Internet over the past two decades, many companies are searching for automated ways to analyze large volumes of textual data. Decision Analyst offers several text-analytic services, based on different software systems, to analyze and report on textual data. These software systems are very powerful, but they cannot take the place of the thinking human brain. The results from these software systems should be thought of as approximations, as crude indicators of truth and trends, but the results must always be verified by other methods and other data.


Business intelligence


Business intelligence (BI) can be described as “a set of techniques and tools for the acquisition and transformation of raw data into meaningful and useful information for business analysis purposes”. The term “data surfacing” is also often associated with BI functionality. BI technologies are capable of handling large amounts of structured and sometimes unstructured data to help identify, develop and otherwise create new strategic business opportunities. The goal of BI is to allow for the easy interpretation of these large volumes of data. Identifying new opportunities and implementing an effective strategy based on insights can provide businesses with a competitive market advantage and long-term stability. BI technologies provide historical, current and predictive views of business operations. Common functions of business intelligence technologies are reporting, online analytical processing, analytics, data mining, process mining, complex event processing, business performance management, benchmarking, text mining, predictive analytics and prescriptive analytics.



Business intelligence is made up of an increasing number of components, including:

*Multidimensional aggregation and allocation

*De-normalization, tagging and standardization

*Real-time reporting with analytical alerts

*A method of interfacing with unstructured data sources

*Group consolidation, budgeting and rolling forecasts

*Statistical inference and probabilistic simulation

*Key performance indicator optimization

*Version control and process management

*Open item management.


Data warehousing:

Often BI applications use data gathered from a data warehouse (DW) or from a data mart, and the concepts of BI and DW are sometimes combined as “BI/DW” or “BIDW”. A data warehouse contains a copy of analytical data that facilitates decision support. To distinguish between the concepts of business intelligence and data warehouses, Forrester Research defines business intelligence in one of two ways:

1. Using a broad definition: “Business Intelligence is a set of methodologies, processes, architectures, and technologies that transform raw data into meaningful and useful information used to enable more effective strategic, tactical, and operational insights and decision-making.” Under this definition, business intelligence also includes technologies such as data integration, data quality, data warehousing, master-data management, and text and content analytics, along with many others that the market sometimes lumps into the “Information Management” segment. Therefore, Forrester refers to data preparation and data usage as two separate but closely linked segments of the business-intelligence architectural stack.

2. Forrester defines the narrower business-intelligence market as referring to just the top layers of the BI architectural stack, such as reporting, analytics and dashboards.

Comparison with competitive intelligence:

BI uses technologies, processes, and applications to analyze mostly internal, structured data and business processes while competitive intelligence gathers, analyzes and disseminates information with a topical focus on company competitors. If understood broadly, business intelligence can include the subset of competitive intelligence.


Business Intelligence Trends:

Organizations are starting to see that data and content should not be considered separate aspects of information management, but should instead be managed in an integrated enterprise approach. Enterprise information management brings business intelligence and enterprise content management together. Organizations are also moving towards operational business intelligence, a segment that is currently underserved and uncontested by vendors. Traditionally, business intelligence vendors have targeted only the top of the pyramid, but there is now a paradigm shift toward taking business intelligence to the bottom of the pyramid, with a focus on self-service business intelligence.

Business needs:

Because of the close relationship with senior management, a critical thing that must be assessed before the project begins is whether there is a business need and a clear business benefit in doing the implementation. Another reason for a business-driven approach to BI implementation is the acquisition of other organizations that enlarge the original organization: it can sometimes be beneficial to implement DW or BI in order to create more oversight.

BI Portals:

A Business Intelligence portal (BI portal) is the primary access interface for data warehouse (DW) and business intelligence (BI) applications. The BI portal is the user’s first impression of the DW/BI system. It is typically a browser application from which the user has access to all the individual services of the DW/BI system, reports and other analytical functionality. The following are desirable features for web portals in general and BI portals in particular:

  • Usable – users should easily find what they need in the BI tool.
  • Content rich – the portal is not just a report-printing tool; it should contain further functionality such as advice, help, support information and documentation.
  • Easy to use – the portal should be implemented in a way that makes its functionality easy to use and encourages users to adopt it.
  • Scalable and customizable – give users the means to fit the portal to their own needs.




Big data analytics is a trending practice that many companies are adopting. Before jumping in and buying big data tools, though, organizations should first get to know the landscape.

In essence, big data analytics tools are software products that support predictive and prescriptive analytics applications running on big data computing platforms, typically parallel processing systems based on clusters of commodity servers, scalable distributed storage, and technologies such as Hadoop and NoSQL databases.

In addition, big data analytics tools provide the framework for using data mining techniques to analyze data, discover patterns, propose analytical models to recognize and react to identified patterns, and then enhance the performance of business processes by embedding the analytical models within the corresponding operational applications.


Powering analytics: Inside big data and advanced analytics tools

The big data analytics market yields a long list of vendors. However, many of these vendors provide big data platforms and tools that support the analytics process, for example data integration, data preparation and other types of data management software. Here we focus on tools that meet the following criteria:

  • They provide the analyst with advanced analytics algorithms and models.
  • They’re engineered to run on big data platforms such as Hadoop or specialty high-performance analytics systems.
  • They’re easily adaptable to use structured and unstructured data from multiple sources.
  • Their performance is capable of scaling as more data is incorporated into analytical models.
  • Their analytical models can be or already are integrated with data visualization and presentation tools.
  • They can easily be integrated with other technologies.


Big data and advanced analytics tools:

While some individuals in the organization are looking to explore and devise new predictive models, others look to embed these models within their business processes, and still others will want to understand the overall impact that these tools will have on the business. In other words, organizations that are adopting big data analytics need to accommodate a variety of user types, such as:

The data scientist, who likely performs more complex analyses involving more complex data types and is familiar with how underlying models are designed and implemented to assess inherent dependencies or biases.

The business analyst, who is likely a more casual user looking to use the tools for proactive data discovery or visualization of existing information, as well as some predictive analytics.

The business manager, who is looking to understand the models and conclusions.

IT developers, who support all the prior categories of users.


How Applications of Big Data Drive Industries:

Generally, most organizations have several goals for adopting big data projects. While the primary goal for most organizations is to enhance customer experience, other goals include cost reduction, better-targeted marketing and making existing processes more efficient. In recent times, data breaches have also made enhanced security an important goal that big data projects seek to incorporate. More importantly, however, where do you stand when it comes to big data? You will very likely find that you are either:

  • Trying to decide whether there is true value in big data or not
  • Evaluating the size of the market opportunity
  • Developing new services and products that will utilize big data
  • Repositioning existing services and products to utilize big data, or
  • Already utilizing big data solutions.

With this in mind, having a bird’s eye view of big data and its application in different industries will help you better appreciate what your role is, or is likely to be in the future, in your industry or across different industries.

In this article, I shall examine 10 industry verticals that are using big data, industry-specific challenges that these industries face, and how big data solves these challenges.



Having gone through 10 industry verticals including how big data plays a role in these industries, here are a few key takeaways:

  • There is substantial real spending around big data
  • To capitalize on big data opportunities, you need to familiarize yourself with and understand where spending is occurring
  • Match market needs with your own capabilities and solutions
  • Vertical industry expertise is key to utilizing big data effectively and efficiently
  • If there’s anything you’d like to add, explore, or know, do feel free to comment below.