ISBN-13: 9783031146367 / English / Paperback / 2023
The power of data drives the digital economy of the 21st century. It has been argued that data is as vital a resource as oil was during the industrial revolution. An upward trend in the number of research publications using machine learning in some of the top journals, combined with an increasing number of academic recruiters within psychology asking applicants for Python knowledge, indicates a growing demand for these skills in the market.
While there are plenty of books covering data science, books on the market rarely, if ever, address the needs of social science students with no computer science background. They are typically written by engineers or computer scientists for people of their own discipline. As a result, such books are often filled with technical jargon and examples irrelevant to psychological studies or projects. In contrast, this book was written by a psychologist in a simple, easy-to-understand way that is brief and accessible. The aim of this book is to make the learning experience on this topic as smooth as possible for psychology students and researchers with no background in programming or data science.
Completing this book will also open up an enormous number of possibilities for quantitative researchers in psychological science, as it will enable them to explore newer types of research questions.
SECTION 1: The Age of Data-Science
In this section, readers will learn about the theoretical background of data science, such as what it is (and what it is not), how it differs from machine learning, and so on. More specifically, students will be introduced to the "right" information on what is what and what they should expect.
Chapter 1: What is Data-Science?
Chapter 2: Why should we care to invest in this book today?
Chapter 3: What a typical Data-Science Project looks like?
Chapter 4: What are the pre-requisites, costs, and challenges?
SECTION 2: How to implement Analytics in Academic Research?
Chapter 1: How to persuade my evil funder to invest in this?
Chapter 2: How to migrate from Statistics to Data-Analytics?
Chapter 3: How to design a Data-Science based research project?
Chapter 4: Case studies from Mental Health Research
Exploration of patients’ lived experiences
Forecast the patients’ symptoms
Predict the past trauma of patients
Chapter 5: I am a Healthcare Professional – how to apply analytics to my regular consultancy practice with patients?
Chapter 6: How to get these data?
Chapter 7: FAQs
SECTION 3: How to implement Analytics in Business?
Chapter 1: How to persuade my thrifty boss to invest in this?
Chapter 2: Asking the right questions in the HR department: How to analyse people who work for you?
· Workforce planning: Whom and by when you need?
· Finding the talents from the crowd
· Acquiring the talents
· Onboarding
· Engaging the talent
· Assess the performance
· Retaining the talent
Chapter 3: Asking the right questions in the Marketing department: How to analyse people who buy from me?
· Who buys my product and why?
· How to design and run successful Marketing Experiments: How much you have to cut prices to drive the most sales? Or which advertisement copy is more effective in customer conversion?
Chapter 4: How to get these data?
Chapter 5: FAQs
SECTION 4: GETTING TO KNOW THE TOOLS
With the four chapters included in this section, students will cover enough of the basic ideas to support many of the programming techniques needed in data science. Students will also learn the basics of KNIME and why both Python and KNIME are "great" tools of choice for this. The section starts with how to install Python and KNIME. Students will get oriented to what they are getting into, how that translates into real-world applications, what the book covers (and what it does not), and what they will be able to do by the end of the book.
We will be using Python 3 and KNIME in this book because both are industry standards, meaning that employers use them for their projects. The rationale for using both KNIME and Python is that most of the target students come from a non-technical background with no experience in coding. KNIME provides a codeless way to learn data science, build confidence, and understand the concepts firmly. At the same time, employers often seek Python coding knowledge for data analysis positions, so this book caters to that need by enabling students to transfer the lessons learnt in KNIME into Python 3.
Chapter 1: Getting started with Python
In this chapter, students will get an introduction to Python and programming in general, along with the motivation for why they should learn Python at all.
Additionally, this chapter will demonstrate how to install Python for the first time. After that, students will learn to do basic calculations with "operators" and "variables".
To sum up, in this chapter, students will learn the following:
1. Why learn a programming language?
2. How to install Python 3?
3. How to write your first ever computer program?
4. How to do basic calculations?
5. What are the variables and operators?
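As a rough illustration of what "basic calculations with operators and variables" can look like, here is a minimal sketch. The variable names and numbers are invented for this example and are not taken from the book.

```python
# Variables store a value under a name; operators combine values.
sessions_planned = 10    # hypothetical example data
sessions_attended = 8

# Arithmetic operators: subtraction and division.
missed = sessions_planned - sessions_attended
attendance_rate = sessions_attended / sessions_planned  # "/" always gives a float

# Floor division (//) and remainder (%) split 45 days into weeks and days.
whole_weeks = 45 // 7
days_left_over = 45 % 7

print("Missed sessions:", missed)
print("Attendance rate:", attendance_rate)
print("45 days is", whole_weeks, "weeks and", days_left_over, "days")
```

Running the script prints each result, which is a common way to check one's first programs.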
Chapter 2: Strings, lists, and tuples in Python
In this chapter, we will describe some of the basic Python data types, such as strings, lists, tuples, and dictionaries.
More elaborately, students will learn:
In this chapter, students will learn about the different kinds of conditions and how to use if statements. We will then progress to the if-elif-else statement. Finally, in this same chapter, we will cover how to create for loops and while loops.
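The following sketch shows how an if-elif-else statement, a for loop, and a while loop might fit together. The scores and cut-off values are made up purely for illustration and do not correspond to any real questionnaire.

```python
scores = [3, 7, 12]  # hypothetical questionnaire scores

# A for loop visits each score in turn; if/elif/else picks one branch per score.
labels = []
for score in scores:
    if score < 5:
        labels.append("minimal")
    elif score < 10:
        labels.append("mild")
    else:
        labels.append("moderate or above")
print(labels)

# A while loop repeats for as long as its condition stays true.
countdown = 3
steps = 0
while countdown > 0:
    countdown -= 1
    steps += 1
print("The while loop ran", steps, "times")
```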
Chapter 4: Creating function, class and objects with Python
In this chapter, we will use and create functions, and cover how to create classes and objects in Python.
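A small sketch of a function, a class, and an object created from that class is given below. The names `mean`, `Participant`, and `average_score` are hypothetical and chosen only for this illustration.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


class Participant:
    """A study participant with an identifier and a list of scores."""

    def __init__(self, identifier, scores):
        self.identifier = identifier
        self.scores = scores

    def average_score(self):
        # A method can reuse an ordinary function.
        return mean(self.scores)


# An object is one concrete instance of the class.
p = Participant("P01", [4, 6, 8])
print(p.identifier, p.average_score())
```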
Chapter 5: Getting started with KNIME
SECTION 5: Data-Pre-processing
In this section, students will learn what data is and where to get their data from (chapters 1 and 2). Students will also discover that data science projects are not just about building machine learning models; about 80% of their time and effort will be spent on preparing the data for analysis. In chapters 3 and 4, students will learn what data pre-processing is, why it is needed, and so on. It involves three main steps: data cleaning, data transformation, and data reduction.
In the first step (chapter 5), students will learn how to make sure that the data is clean. There might be many 'NAs' (missing data) and outliers, which will need cleansing and formatting. At other times, there can be erroneous values (e.g. age = 300 years). It is vital to clean these missing and inaccurate data at this stage. Students will learn how to spot them, and different techniques (such as imputation) to clean the data. Skipping this step can have a significant negative impact on the final output and give inaccurate results.
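To make the cleaning step concrete, here is a minimal sketch of median imputation using only the Python standard library. The data, the plausible age range of 0 to 120, and the choice of the median are assumptions made for this example; the book may use different tools and rules.

```python
from statistics import median

# Hypothetical ages: None marks a missing value, 300 is clearly erroneous.
ages = [23, None, 31, 300, 27, None, 35]


def is_valid(age):
    # Assumed plausibility rule: present and between 0 and 120 years.
    return age is not None and 0 <= age <= 120


# Compute the median from the valid values only.
valid_ages = [age for age in ages if is_valid(age)]
fill_value = median(valid_ages)

# Imputation: replace missing or implausible values with the median.
cleaned = [age if is_valid(age) else fill_value for age in ages]
print(cleaned)
```

Median imputation is only one option; dropping the affected rows or using a model-based estimate are alternatives with different trade-offs.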
In the second step (chapter 6), students will learn how to transform the data into a format or shape that the computer can understand and interpret accurately. Students will also learn techniques for data transformation (i.e. normalisation, attribute selection and discretisation).
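Two of the transformations named above can be sketched in a few lines. This example assumes min-max scaling as the form of normalisation and uses invented income bands for discretisation; both choices are illustrative only.

```python
incomes = [20, 35, 50, 80]  # hypothetical values, in thousands

# Normalisation (min-max scaling): rescale every value into the range 0 to 1.
lowest, highest = min(incomes), max(incomes)
normalised = [(x - lowest) / (highest - lowest) for x in incomes]


def to_band(x):
    # Discretisation: turn a number into a category (cut-offs are assumptions).
    if x < 30:
        return "low"
    elif x < 60:
        return "medium"
    return "high"


bands = [to_band(x) for x in incomes]
print(normalised)
print(bands)
```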
In the third step of data pre-processing (chapter 7), students will learn how to reduce the data, because analysing big data can be needlessly time-consuming, confusing, counter-productive, and even distracting from what matters most. In such a scenario, we can conduct data reduction. Data reduction is simply the process of reducing the size of the data by reducing the number of categories or variables, which makes the data more comprehensible. In this chapter, students will learn about different techniques for reducing the data: feature selection and extraction with dimensionality reduction, and numerosity reduction.
Finally, students will learn how to visualise data and spot errors using basic data visualisation techniques, such as box-and-whisker plots.
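A box-and-whisker plot conventionally flags points lying more than 1.5 times the interquartile range beyond the quartiles. The sketch below applies that rule numerically, without drawing the plot, using the standard library's `statistics.quantiles`. The data are invented, and the 1.5 multiplier is the usual convention rather than something stated in this description.

```python
from statistics import quantiles

ages = [22, 25, 27, 29, 31, 33, 35, 300]  # hypothetical data with one bad entry

# Quartiles split the sorted data into four equal parts.
q1, q2, q3 = quantiles(ages, n=4)
iqr = q3 - q1  # interquartile range: the height of the "box"

# The conventional whisker limits of a box plot.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Points outside the fences are the ones a box plot draws as outliers.
outliers = [age for age in ages if age < lower_fence or age > upper_fence]
print("Suspicious values:", outliers)
```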
SECTION 7: Web Scraping.
This section is a bonus part where students will learn how to use free online tools (no Python or KNIME) to scrape data from the desired website. Students will also learn about the legalities of web scraping in this section.
SECTION 8: Supervised Machine Learning
In this section, students will be introduced to what supervised machine learning is and to its two major types of tasks: classification tasks and regression tasks.
Students will learn the classification task, that is, how to train a model (program) to draw a conclusion (in terms of Yes/No, True/False, Spam/Not-Spam, and so on) from the processed data, and then use that trained model to determine to which category new inputs belong.
Students will also learn how to predict a value from a series of other changing variables. This is called a regression task. Students will learn each of these tasks using real-world examples. We will also discuss the subtle differences between regression and classification tasks in terms of their practical implications. We shall also clarify the terminological overlaps (a common source of confusion among beginners) as we go along. For example, logistic regression (even though it has "regression" in its name) is used for classification rather than for a regression task.
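The point about logistic regression can be illustrated with a short sketch: the model computes a probability, and a threshold turns that probability into a category. The weight, bias, and threshold below are hand-picked, hypothetical numbers; a real model would learn them from data.

```python
import math


def sigmoid(z):
    """Squash any number into the range 0 to 1."""
    return 1 / (1 + math.exp(-z))


def default_probability(debt_ratio, weight=4.0, bias=-2.0):
    # Hypothetical, hand-set parameters; normally these are fitted to data.
    return sigmoid(weight * debt_ratio + bias)


def classify(debt_ratio, threshold=0.5):
    # The probability is a number, but the final answer is a category.
    if default_probability(debt_ratio) >= threshold:
        return "default"
    return "no default"


print(classify(0.9))  # high debt ratio
print(classify(0.1))  # low debt ratio
```

The numeric step resembles regression, which explains the name, while the thresholded output is what makes the task classification.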
Using KNIME:
Students will learn the classification task. They will learn how to build a Probabilistic Neural Network model (example problem: which hire will be more productive?), regression models, both logistic and linear (example problem: which employees will perform better?), and a Decision-Tree model (example problem: which employees will leave or quit the job?).
Using Python:
Students will learn regression tasks, such as how to predict rental income as a linear regression problem (chapters 1 and 2). We will also learn how to predict whether a borrower will default on a loan as a classification problem (chapter 4).
Building a regression model that can predict outcomes is only part of the problem. Until we can deploy it online for everyone else to use for their own purposes, it remains incomplete. So, in chapter 3, we will learn how to deploy a functional regression model as a web application.
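As a rough idea of what the rental-income regression involves, the sketch below fits a straight line by ordinary least squares using plain Python. The flat sizes and rents are invented, and the book may rely on a library rather than these hand-written formulas.

```python
# Hypothetical training data: flat size in square metres and monthly rent.
sizes = [30, 50, 70, 90]
rents = [600, 900, 1200, 1500]

mean_x = sum(sizes) / len(sizes)
mean_y = sum(rents) / len(rents)

# Ordinary least squares for one predictor:
# slope = covariance(x, y) / variance(x); intercept follows from the means.
covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, rents))
variance = sum((x - mean_x) ** 2 for x in sizes)
slope = covariance / variance
intercept = mean_y - slope * mean_x


def predict_rent(size):
    """Predict monthly rent for a flat of the given size."""
    return intercept + slope * size


print("Predicted rent for 60 square metres:", predict_rent(60))
```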
SECTION 6: Unsupervised Machine Learning
In this section, students will learn what unsupervised machine learning is and what type of data is needed for this kind of work (and how that differs from the dataset required for supervised machine learning). After that, students will learn how to do K-Means Clustering (chapter 1) and Hierarchical Clustering (chapter 2), and what the difference between them is. Furthermore, students will learn how, when they need to do a supervised machine learning task but do not have a labelled dataset, unsupervised machine learning can help discover the patterns within the data and learn appropriate labels for the dataset. In other words, students will learn how unsupervised learning can be used as a precursor to supervised machine learning in some cases.
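The core loop of K-Means can be sketched in plain Python for one-dimensional data: assign each point to its nearest centre, then move each centre to the mean of its points. The function name `kmeans_1d`, the data, and the starting centres are assumptions for this illustration; practical work would normally use a library implementation.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Very small K-Means for one-dimensional data with given starting centres."""
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centre.
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Update step: move each centre to the mean of its cluster
        # (an empty cluster keeps its old centre).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


scores = [1, 2, 3, 10, 11, 12]  # hypothetical unlabelled data
centres, groups = kmeans_1d(scores, centroids=[1, 12])
print(centres)
print(groups)
```

The resulting group membership can then serve as a label for each data point, which is the sense in which clustering can precede a supervised task.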
SECTION 7: Network Analysis
In this section, students will learn how to visualise the relationships within data by constructing network graphs. Students will also learn about the different types of charts and customisation features available, all using Python (chapters 1 and 2). Network graph analysis is a booming domain within social network analysis. It will be discussed here in terms of human resource management and stock market analysis. We will also learn how to use an open-source platform called Gephi to convert a NetworkX file into HTML format (chapter 3) and then implement it online as a web application (chapter 4).
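Before any drawing takes place, a network is just a set of nodes and the links between them. The sketch below builds that structure with the standard library only (it deliberately does not use NetworkX or Gephi) and counts each person's connections. The names and links are invented.

```python
from collections import defaultdict

# Hypothetical "who works with whom" links in a small team.
edges = [("Ana", "Ben"), ("Ana", "Cara"), ("Ana", "Dev"), ("Ben", "Cara")]

# Build an undirected graph as a mapping from each person to their neighbours.
neighbours = defaultdict(set)
for a, b in edges:
    neighbours[a].add(b)
    neighbours[b].add(a)

# Degree: the number of direct connections each person has.
degree = {person: len(links) for person, links in neighbours.items()}
most_connected = max(degree, key=degree.get)

print(degree)
print("Most connected person:", most_connected)
```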
SECTION 8: Map
Dr Chandril Ghosh is a UK-based chartered psychologist and is currently working as a Lecturer in Clinical/Counselling Psychology at Bath Spa University. He completed his BSc in Psychology (honours) and MSc in Clinical Psychology in India. After completing his MSc, Ghosh began to study machine learning and Python programming through books and online materials on the subject. He had no background or prior experience with coding or computer science at that time. During his doctoral studies, he used his knowledge of the subject to apply machine learning techniques to the exploration of psychopathology. Around the same time, he was hired multiple times to design and deliver a crash course on Python 3 and machine learning for postgraduate students at Queen’s University Belfast. He also runs online courses on the subject outside the University, with students from about 56 countries. This book is the product of his hundreds of hours of teaching and of feedback from students with social science backgrounds.