Predicting Restaurant Customer Rating

Index

Abstract
Introduction
Objective of the project
Specification of the hypothesis
Context of the dataset
Critical evaluation of the applicable techniques
Implementation of the chosen technique
Interpretation of the results - Quantitative results and qualitative interpretation
Conclusion

Abstract

The aim of the project is to discover the relationship between the dependent variable and the independent variables, that is, to understand how each independent variable influences the dependent variable. A restaurant's rating depends on many attributes, such as the quality of the food, the price, the ambiance of the restaurant, whether the restaurant offers online delivery, and whether it offers table reservation. All these factors affect business profit, because customers take them into consideration when choosing where to dine. Customer Relationship Management therefore plays an important role in improving the business profit of any organization.

Keywords: CRM, hypothesis, sentiment analysis, support vector machine

Introduction

Customer relationship management (CRM) plays a vital role in an organization. To be successful in business, an organization needs a good relationship with its customers. It also needs loyal, long-lasting customers in order to increase its business value and profit. The main purpose of CRM is to collect all the necessary customer-related data and to analyze it using data analytics and machine learning techniques.
The results of this analysis have several advantages. Because the data is customer related, the results of machine learning and data analysis can be used to improve product quality, manage customer data and customer interactions, manage customer accounts, find new customers, and retain existing ones. The analysis also reveals what exactly customers expect, so the organization can improve the quality of its products. This in turn increases customer satisfaction and business profit. CRM can therefore be used to improve customer value and customer relationships.

Objective of the project

With the advent of e-commerce on the Internet, searches on social media and for restaurants have increased enormously. Online reviews of products, places and restaurants have a great impact on business profit, because customers look up reviews and ratings before purchasing a product or dining at a restaurant. Customer ratings therefore play an important role in business profit. Online customer reviews and ratings can help a restaurant improve its quality and standards, and thus its profit. A restaurant's rating matters to online users because it summarizes multiple factors in a single value: quality of food, ambiance, price range, whether the restaurant offers table reservation, whether it offers online delivery, the type of cuisine, the location, and so on. There are several online restaurant search websites from which data can be acquired to predict restaurant customer ratings. I chose "Zomato", which is one of the most popular restaurant search sites. The ratings are useful to users who visit the Zomato site to search for the best restaurants in a city. In the dataset, the customer rating can be classified using multiple other parameters, which are explained in the next section.
In this project, the rating given to a restaurant by its customers is classified. From this, customer ratings and business profitability can be predicted.

Specification of the hypothesis

From the Zomato dataset, the following hypothesis can be formed. The dataset has multiple attributes: how do the independent attributes of the dataset influence the dependent variable, which is the restaurant rating? To be precise, do the ways in which a restaurant presents itself ("Location", "Cuisine", "Cost", "Table reservation", "Online delivery") have an effect on the "Rating text"?

Context of the dataset

The dataset has, among others, the following attributes: Restaurant Name, Restaurant ID, City, Address, Cuisines, Cost for Two, Table Reservation, Online Delivery, Delivering Now, Switch to Order Menu, Price Range, Aggregate Rating, Rating Color, Rating Text, and Votes. The restaurant name holds the names of all the restaurants in a location, the restaurant ID is unique to each restaurant, the city is used to list all the restaurants in a city, the address is useful for locating the restaurant within an area, the cuisines attribute lists the types of food served in the restaurant, and the cost for two gives the total cost of a meal for two people. The other attributes describe further characteristics of the restaurant. A restaurant may offer table reservation, online delivery and so on, and all these attributes are expected to be highly correlated with the dependent variable, the rating. If a restaurant has all these features and the quality of its food is very good, its rating is likely to be high. In other cases, a restaurant may lack some of the features but serve good food at a lower overall price, so customers may prefer it and its rating is also likely to be high. All these attributes together therefore determine the rating that customers give the restaurant.
The rating helps other users who visit the Zomato website, so the better the rating of a restaurant, the higher its business profit. The aggregate rating is a numerical value on a scale of one to five, with one being the lowest and five the highest. The rating text is coded as excellent, very good, good, or okay. For example, a restaurant with an overall rating of 4.8 is coded as excellent in the rating text. The rating text is the dependent variable because it is a categorical variable, whereas the aggregate rating is a continuous numeric value.

Critical evaluation of the applicable techniques

There are various methods for finding restaurant ratings using machine learning techniques. These ratings help users of the Zomato website choose the best restaurant to dine at. Sentiment analysis has been used to find the rating of a restaurant. Here, the sentiment score automatically ranks the restaurant, helping users or customers choose the best one. The sentiment score is calculated from user reviews: keywords in the reviews have ratings associated with them, and each is assigned a sentiment score. This helps to uncover the tone behind a user's review. The process can be explained as follows. The dataset is taken from the Yelp site and contains about 100,324 reviews covering 2,000 restaurants. The reviews contain many words such as good, bad, excellent, wonderful, amazing, horrible and terrible, and the sentiment score is calculated from all these words. The process for sentiment analysis is as follows. First, the reviews are split into separate opinion words. A text file lists positive and negative words, and each word has a corresponding opinion score, as seen in the table above. The final opinion score is then calculated. After the opinion words are scored, the emoticons are identified.
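The opinion-word step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the words in the lexicon come from the text, but the numeric scores, the function names, and the choice to average the matched scores are hypothetical and not taken from the study.

```python
import re

# Hypothetical opinion lexicon. The words appear in the text; the scores
# are illustrative assumptions, not values from the original study.
OPINION_SCORES = {
    "good": 1.0,
    "excellent": 2.0,
    "wonderful": 2.0,
    "amazing": 2.0,
    "bad": -1.0,
    "horrible": -2.0,
    "terrible": -2.0,
}


def split_into_words(review):
    """Split a review into lowercase word tokens."""
    return re.findall(r"[a-z']+", review.lower())


def opinion_score(review):
    """Average the scores of the opinion words found in one review.

    Returns 0.0 (neutral) when the review contains no opinion word.
    """
    scores = [OPINION_SCORES[w] for w in split_into_words(review)
              if w in OPINION_SCORES]
    return sum(scores) / len(scores) if scores else 0.0


def restaurant_score(reviews):
    """Average the per-review opinion scores for one restaurant."""
    if not reviews:
        return 0.0
    return sum(opinion_score(r) for r in reviews) / len(reviews)
```

Note that lowercasing the review discards capitalization, so a real implementation that wants to reward emphasis would need to inspect the original text before this step.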
To understand how the emoticons are identified, take two sentences as an example: "the food was GREAT" and "the food was excellent". The first sentence receives a higher score than the second, because the user expressed the positive review more emphatically. The sentiment score is calculated by averaging all positive and negative scores. The neutral score is also calculated. The sum of all scores gives the sentiment score. Finally, the rating is calculated. The following hypotheses can be made: If the customer likes the food, the rating of the restaurant will be better. The restaurant environment plays an important role in the rating of a restaurant. The rating will be higher if the restaurant service is good. The price of the restaurant also plays an important role in its rating. There are other techniques for implementing sentiment analysis, such as Naïve Bayes, Support Vector Machines, Decision Tree, K-Nearest Neighbor Classifier, Winnow Classifier and AdaBoost Classifier.

Implementation of the chosen technique

The dataset downloaded from Kaggle needs to be preprocessed before any machine learning algorithm is applied to it, because it contains unwanted noise, missing values, null values and special characters. The dataset had many unwanted rows and columns, such as country code, detailed location, latitude, longitude and currency. These attributes were removed before applying machine learning, since these independent variables do not have much effect on the dependent variable. The remaining missing and null values were cleaned in R using the gsub function or removed manually in Excel. The correlation between the different attributes of the dataset can be read from the output obtained from RStudio. Correlation can be classified into three types: strong correlation, no correlation and neutral correlation.
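The preprocessing and correlation steps above can be sketched as follows. The project itself used R and Excel; this is a minimal Python stand-in using only the standard library. The column header spellings, the helper names, and the rule of discarding any row with an empty field are assumptions made for illustration.

```python
import math
import re

# Columns the text reports as removed. The exact header spellings are
# assumptions about the Kaggle file, not confirmed by the source.
DROPPED_COLUMNS = {"Country Code", "Locality Verbose", "Latitude",
                   "Longitude", "Currency"}


def clean_text(value):
    """Strip special characters, similar in spirit to R's gsub()."""
    return re.sub(r"[^A-Za-z0-9 .,]", "", value).strip()


def preprocess(rows):
    """Drop unwanted columns, clean each field, skip incomplete rows."""
    cleaned = []
    for row in rows:
        kept = {key: clean_text(value or "")
                for key, value in row.items()
                if key not in DROPPED_COLUMNS}
        if all(kept.values()):  # discard rows with any empty field
            cleaned.append(kept)
    return cleaned


def yes_no_to_number(value):
    """Encode a Yes/No attribute as 1.0/0.0 so it can be correlated."""
    return 1.0 if value.strip().lower() == "yes" else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant column
    return cov / math.sqrt(var_x * var_y)
```

Rows here are plain dictionaries, such as those produced by `csv.DictReader`; Yes/No columns are passed through `yes_no_to_number` before calling `pearson`.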
Attributes with a correlation of 0 have no correlation. Attributes with a correlation between 0 and 0.5 have a neutral correlation. Attributes with a correlation between 0.5 and 1 are strongly correlated. As an example of a strong correlation, "Price range" and "Has table reservation" are strongly correlated. "Has online delivery" and "Has table reservation" have a neutral correlation with each other. There are several techniques for classifying restaurant ratings using machine learning algorithms. Rating scores can be calculated using the sentiment analysis discussed earlier. In this project, a support vector machine was implemented to classify the restaurant rating, and the method uses additional attributes to improve the classification accuracy. The additional attributes include whether the restaurant offers table booking, whether it has online delivery, whether it is delivering now, the switch-to-menu option, the price range of the restaurant, the total number of votes from customers, and the location of the restaurant. These are all independent variables that have an effect on the dependent variable, the restaurant rating. Now let's see why Support Vector Machine is used to classify the restaurant rating rather than other machine learning techniques: Support Vector Machine is used for supervised machine learning. Can be.