What is a Case Study?
A case study is an excellent method of learning wherever the practical aspects of a subject are involved. A good case study should comprise an appropriate introduction to the case, followed by identification of the problem and a concrete analysis of it.
Meaning of a Case Study
A case study is a research strategy and an empirical inquiry that investigates a phenomenon within its real-life context. It gives the learner the opportunity to demonstrate independence and originality, to plan and organize the case analysis, and to put into practice some of the legal aspects taught throughout the program.
Technical specifications of the Submissions
Font: Times New Roman, font size 12, spacing 1.5
Margin: Left 35 mm, Right 20 mm, Top 35 mm, Bottom 20 mm
First (preliminary) page should have the following information:
Top: The title in block and capital (uppercase).
Centre: Full name of the student in capital letters and registration number.
Full name of the Guide (If any) in block letters, with the designation.
Bottom: Name of the Institute (i.e. SCDL), in block and capital with the academic year.
How to prepare a case study
Provide a context for the case and describe any similar cases previously reported. Here you can give the problem statement in brief.
Several sentences describe the history and results of any examinations performed. The working diagnosis and management of the case are described.
Describe the essential nature of the problem.
Develop the solution to the problem further.
Summarize the results of examination.
Explain the importance of various variables/features in the dataset
Management and Outcome
Simply describe the course of the complaint. Where possible, refer to any outcome measures that you used to demonstrate the results objectively.
Describe as specifically as possible the solution provided, including the nature of the formula.
If possible, refer to objective measures of the solution's progress.
Describe the resolution of the solution.
Choose Topics From Below
Some examples include solving common business problems in Marketing, Sales, Customer Clustering, Banking, Real Estate, Insurance, Travel and many more. Please choose a dataset for doing the submission in 2 parts:
1) The first part will consist of EDA
The use case and dataset for both these submissions should be the same. Please also keep in mind that the submission for semester 2 consists of the first part, and for semester 4 you have to submit the ML application use case and code.
2) Data Science at Flipkart [Recommendation engine for products]
3) Case Study of Data Science at Facebook
4) Customer Analytics at Flipkart.Com [Clustering users based on their interests till date]
Students are required to submit an individual ‘Submission I’ as per the guidelines given. A Case Study Submission I Report that was prepared as a group activity or copied will not be considered for further evaluation.
Please note that a plagiarism check is mandatory before submitting the final soft copy of the Submission I Report to SCDL for evaluation. In simple terms, plagiarism is copy-paste. Students can use the Free and Open Source Software (FOSS) available online to check for it (e.g. http://plagiarisma.net/). As per the UGC guidelines, the similarity index should be less than 10%. The plagiarism report should be attached to Submission I; without it the case study report will not be evaluated.
Sample Solution For Movie Success Rate Prediction
1. Problem Statement
This paper evaluates how well a movie's success rate can be predicted from its associated success variables. A great deal of information is now shared through the internet and social media, covering entertainment, news, business, and so on. Many movies are produced and released every month all over the world; some of them succeed and some do not. People watch movies in multiplexes or on online portals such as Amazon Prime, Disney+ Hotstar, and others. The success rate of a movie is important because a huge amount of money is invested in making it. The director invests a huge amount of money to make the movie better so that it can earn money. Production teams, and the movie industry as a whole, therefore always worry about a movie's success rate. In this paper we use a supervised machine learning algorithm, Logistic Regression, to predict the movie success rate. The data comes from Kaggle; it is free, open, publicly available to any researcher, and provided by the Internet Movie Database. The dataset contains both numeric and categorical variables such as rating, director, actor/actress, budget, genre, title, running time, MPAA ratings, awards, etc. Finally, we evaluate the model using precision, recall, and accuracy.
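The three evaluation measures named above can be sketched in plain Python. This is only an illustration under assumptions: the function name classification_metrics and the label lists are made up for the example (1 stands for a successful movie, 0 for an unsuccessful one) and are not taken from the actual model.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision and recall for binary labels (1 = success)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when nothing is predicted/actually positive
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical labels, for illustration only
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Accuracy counts all correct predictions, precision looks only at the movies predicted as successful, and recall looks only at the movies that actually were successful, so the three numbers can differ noticeably on the same predictions.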
A large number of movies are released every year, and they are a good source of entertainment. Movie prediction means estimating the success rate of a movie. The film industry, both Hollywood and Bollywood, has grown over the last 15 years and has reached a peak in earnings through online and offline channels. Every year different kinds of movies are released. Some of them affect people and inspire them to succeed, such as motivational and historical movies. The main objective of every movie maker is to earn a profit and to make the movie popular with viewers. Predicting the success of each movie is very difficult for the film industry. The success of a movie depends on many features such as songs, story, title, genre, rating, actors, graphics, etc. “Fight Club” was a very famous movie, yet according to some movie analysts it earned only about a 25 percent gain in its first two weeks, which is low relative to the investment in the movie. Sometimes people are unsure which movie to watch. To handle this situation, machine learning techniques can be used to identify good movies. Such predictions could also help investors and movie producers choose new movies, actors, and actresses wisely for future investment. Nowadays many websites and sources provide information about movies and other entertainment programs. The Internet Movie Database, launched in October 1990, is a good collection of large movie datasets. It offers many datasets covering TV programs, Hollywood movies, Bollywood movies, and games. The Internet Movie Database offers several features and streaming associated with movies, including ratings, production crew, reviews, images, runtime, quotes, videos, and more. It has more than 6 million titles and 10 million personalities in its database.
3. Introduction to Dataset
The dataset “movie_success_rate.csv” brings together a diverse range of information about movies that is needed to analyze them and predict their success. Such data has become increasingly important over roughly the last decade because of people's interest in the film industry. This dataset has 839 rows and 33 columns. Below we describe the features used to predict the movie success rate.
Shape Of dataset: (839, 33)
From the dataset shape above, we can say that it has 839 rows and 33 feature columns. As per the above table, the columns have different data types. Some of the columns are of float data type (Rank, Actors, Year, Runtime, Rating, Votes, Revenue, Metascore, Action, Adventure, Aniimation, Biography, Comedy, Crime, Drama, Family, Fantasy, History, Horror, Music, Musical, Mystery, Romance, Sci-Fi, Sport, Thriller, War, Western, Success) and the others are of object data type (Title, Genre, Description, Director).
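A minimal sketch of how the shape and the split into numeric and non-numeric columns could be inspected with pandas. The tiny DataFrame is made-up stand-in data, and summarize_columns is a hypothetical helper name; with the real file one would load movie_success_rate.csv instead.

```python
import pandas as pd


def summarize_columns(df: pd.DataFrame) -> dict:
    """Return the shape and the column names grouped into numeric / non-numeric."""
    numeric_cols = df.select_dtypes(include="number").columns.tolist()
    non_numeric_cols = df.select_dtypes(exclude="number").columns.tolist()
    return {"shape": df.shape, "numeric": numeric_cols, "non_numeric": non_numeric_cols}


# Toy stand-in for the real data (assumption); the actual dataset would be read with
# df = pd.read_csv("movie_success_rate.csv")
toy_df = pd.DataFrame(
    {
        "Title": ["Movie A", "Movie B", "Movie C"],
        "Rating": [7.1, 6.4, 8.0],
        "Votes": [1200.0, 540.0, 9800.0],
        "Success": [1.0, 0.0, 1.0],
    }
)

summary = summarize_columns(toy_df)
print(summary["shape"])        # (rows, columns)
print(summary["numeric"])      # float columns
print(summary["non_numeric"])  # text columns such as Title
```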
4. Data Pre-processing & Summary
4.1 Data Pre-Processing:
Real-world data generally contains many issues such as noise, missing values, and improper formatting, so it cannot be used directly by machine learning algorithms. Data pre-processing is the process of cleaning the data and making it suitable for an ML model, which increases both the efficiency and the accuracy of the model.
➢ Data pre-processing is the first and main step; it enhances the quality of the data so that meaningful insights can be extracted from it. Data pre-processing in Machine Learning refers to the technique of preparing (cleaning and organizing) the raw data to make it suitable for building and training Machine Learning models.
➢ Make a list of the data and the target. Check for null values; if any exist, they need to be filled.
In our paper we used the following steps to pre-process the data:
Checking the Null Values: The code below checks for null values in the dataset. In the movie_success_rate dataset we did not find any missing values.
# Checking for null values: visualize missing entries as a heat map
import seaborn as sns
check_null_value = df.isnull()
sns.heatmap(check_null_value, yticklabels=False, cbar=False, cmap='viridis')
From the heat map above we can clearly see that the dataset has no missing or NaN values.
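Besides the heat map, the null check can also be done numerically. The sketch below counts missing values per column and, if any exist, fills them. Filling numeric columns with the column median is an assumption made for this example; the text does not say which fill method should be used. The helper names and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd


def count_missing(df: pd.DataFrame) -> pd.Series:
    """Number of missing (NaN) entries in each column."""
    return df.isnull().sum()


def fill_numeric_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy in which NaNs in numeric columns are replaced by the column median."""
    filled = df.copy()
    numeric_cols = filled.select_dtypes(include="number").columns
    filled[numeric_cols] = filled[numeric_cols].fillna(filled[numeric_cols].median())
    return filled


# Toy data with one deliberately missing rating (illustration only)
toy_df = pd.DataFrame({"Rating": [7.0, np.nan, 8.0], "Votes": [100.0, 200.0, 300.0]})

missing_before = count_missing(toy_df)
clean_df = fill_numeric_missing(toy_df)
missing_after = count_missing(clean_df)
print(missing_before)
print(missing_after)
```

On a dataset without missing values, such as the one described here, count_missing returns zero for every column and the fill step changes nothing.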
4.2 Summary Statistics
Summary statistics summarize the data at hand through a few numbers such as the mean, std, etc., which makes the data easier to understand. They are used to find statistical information about all numeric columns. Here we use the simple predefined pandas function describe() to show the summary of the dataset.
# Summary of the dataset
df.describe()
Count: the number of non-null values in each column of the dataset
Mean: the mean of each numeric feature column in the dataset
Std: the standard deviation of each numeric feature column
Min: the minimum value of each column
Max: the maximum value of each column
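A small worked example of these summary rows, using made-up numbers rather than the real dataset. Note that the std reported by describe() is the sample standard deviation (divisor n - 1).

```python
import pandas as pd

# Toy values chosen so the statistics are easy to check by hand (illustration only)
toy_df = pd.DataFrame({"Rating": [6.0, 7.0, 8.0], "Votes": [100.0, 200.0, 600.0]})

stats = toy_df.describe()
rating_mean = stats.loc["mean", "Rating"]  # (6 + 7 + 8) / 3
rating_std = stats.loc["std", "Rating"]    # sqrt(((-1)^2 + 0^2 + 1^2) / (3 - 1))
print(stats)
```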
5. Feature Selection
This is the next step after pre-processing the dataset. In machine learning, feature selection is the process of reducing the number of input variables when developing a predictive model.
It is mainly used to reduce the number of input variables in order to lower the computational cost of modelling, and in some cases to improve the performance of the model. We keep the variables that are useful and remove the features that are not useful for prediction.
Feature selection can work by evaluating the relationship between each input variable and the target variable using statistics and selecting the input variables that have the strongest relationship with the target variable. This makes the model faster and more efficient and can give more accurate results. Features can be selected either manually or using algorithms. In our task we select both the features and the target manually.
See the code below:
# Select the feature columns and the target variable, leaving out all unnecessary columns
x = df[['Year', 'Runtime (Minutes)', 'Rating', 'Votes', 'Revenue (Millions)', 'Metascore',
        'Action', 'Adventure', 'Aniimation', 'Biography', 'Comedy', 'Crime', 'Drama',
        'Family', 'Fantasy', 'History', 'Horror', 'Music', 'Musical', 'Mystery',
        'Romance', 'Sci-Fi', 'Sport', 'Thriller', 'War', 'Western']]
y = df['Success']
In the code above, x holds the feature variables and y is the target variable, both of which we chose manually from our movie_success_rate dataset.
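The sample solution selects its features manually. As an illustration of the statistical alternative, the sketch below ranks numeric features by the absolute Pearson correlation with the target. This is a simple stand-in technique chosen for the example, not the method used in the report; the function name and the toy values are hypothetical.

```python
import pandas as pd


def rank_features_by_correlation(df: pd.DataFrame, target: str) -> pd.Series:
    """Absolute Pearson correlation of each numeric feature with the target, strongest first."""
    numeric = df.select_dtypes(include="number")
    corr_with_target = numeric.corr()[target].drop(target)
    return corr_with_target.abs().sort_values(ascending=False)


# Toy data (illustration only): Rating tracks Success closely, Runtime barely at all
toy_df = pd.DataFrame(
    {
        "Rating": [5.0, 6.0, 7.0, 8.0, 9.0, 9.5],
        "Runtime": [100.0, 100.0, 130.0, 95.0, 120.0, 110.0],
        "Success": [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
    }
)

ranking = rank_features_by_correlation(toy_df, "Success")
print(ranking)
```

Features at the bottom of such a ranking are candidates for removal, although a low linear correlation alone does not prove that a feature is useless.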
6. Exploratory Data Analysis (EDA)
In this step we use visualizations to show and easily understand the features and their relationships with other features.
It is divided into two categories:
– Univariate Analysis: histogram, distribution (distplot, boxplot, violin)
– Multivariate Analysis: scatter plot, pair plot, etc.
# Visualize the frequency distribution of the `Rating` variable
import matplotlib.pyplot as plt
import seaborn as sns
f, ax = plt.subplots(figsize=(12, 14))
sns.countplot(x="Rating", data=df, palette="Set1", ax=ax)
ax.set_title("Frequency distribution of Rating variable")
# Rotate the existing tick labels; relabelling them from value_counts() would mismatch the bars
ax.tick_params(axis="x", labelrotation=60)
plt.show()
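The count plot above is a univariate view. For the multivariate category, a scatter plot of two features coloured by the target could look like the sketch below. The helper name and the toy values are made up for the example; only the column names Rating, Revenue (Millions) and Success come from the dataset.

```python
import matplotlib.pyplot as plt
import pandas as pd


def scatter_rating_vs_revenue(df: pd.DataFrame):
    """Scatter plot of Rating against Revenue (Millions), coloured by Success."""
    fig, ax = plt.subplots(figsize=(8, 6))
    ax.scatter(df["Rating"], df["Revenue (Millions)"], c=df["Success"], cmap="coolwarm")
    ax.set_xlabel("Rating")
    ax.set_ylabel("Revenue (Millions)")
    ax.set_title("Rating vs. revenue, coloured by Success")
    return ax


# Toy data (illustration only)
toy_df = pd.DataFrame(
    {
        "Rating": [5.5, 6.2, 7.8, 8.1],
        "Revenue (Millions)": [12.0, 40.5, 150.0, 210.3],
        "Success": [0.0, 0.0, 1.0, 1.0],
    }
)

ax = scatter_rating_vs_revenue(toy_df)
# Call plt.show() to display the figure in an interactive session.
```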