Week 1
主要记住的知识点:
1.Regression & Classification
1.回归问题的应用场景
回归问题通常是用来预测一个值,如预测房价、未来的天气情况等等,例如一个产品的实际价格为500元,通过回归分析预测值为499元,我们认为这是一个比较好的回归分析。一个比较常见的回归算法是线性回归算法(LR)。另外,回归分析用在神经网络上,其最上层是不需要加上softmax函数的,而是直接对前一层累加即可。回归是对真实值的一种逼近预测。
2.分类问题的应用场景
分类问题是用于将事物打上一个标签,通常结果为离散值。例如判断一幅图片上的动物是一只猫还是一只狗,分类通常是建立在回归之上,分类的最后一层通常要使用softmax函数进行判断其所属类别。分类并没有逼近的概念,最终正确结果只有一个,错误的就是错误的,不会有相近的概念。最常见的分类方法是逻辑回归,或者叫逻辑分类。
2.supervised learning & unsupervised learning
个人觉得最好分辨的是,监督学习是确定了数据集中一定存在某种关系,而另一种是去探究它们有没有关系。
3.Machine learning definition
第一周包含的话题有以下几个
以下是week 1 的测试题目:
Quiz Introduction
1.
A computer program is said to learn from experience E with respect to some task T and some performance measure P if its performance on T, as measured by P, improves with experience E. Suppose we feed a learning algorithm a lot of historical weather data, and have it learn to predict weather. What would be a reasonable choice for P?
(x) The probability of it correctly predicting a future date's weather.
( ) The weather prediction task.
( ) The process of the algorithm examining a large amount of historical weather data.
( ) None of these.
2.
Suppose you are working on weather prediction, and your weather station makes one of three predictions for each day's weather: Sunny, Cloudy or Rainy. You'd like to use a learning algorithm to predict tomorrow's weather.
Would you treat this as a classification or a regression problem?
(x) Classification
( ) Regression
3.
Suppose you are working on stock market prediction, Typically tens of millions of shares of Microsoft stock are traded (i.e., bought/sold) each day. You would like to predict the number of Microsoft shares that will be traded tomorrow.
Would you treat this as a classification or a regression problem?
( ) Classification
(x) Regression
4.
Some of the problems below are best addressed using a supervised learning algorithm, and the others with an unsupervised learning algorithm. Which of the following would you apply supervised learning to? (Select all that apply.) In each case, assume some appropriate dataset is available for your algorithm to learn from.
[ ] Given data on how 1000 medical patients respond to an experimental drug (such as effectiveness of the treatment, side effects, etc.), discover whether there are different categories or "types" of patients in terms of how they respond to the drug, and if so what these categories are.
[x] Have a computer examine an audio clip of a piece of music, and classify whether or not there are vocals (i.e., a human voice singing) in that audio clip, or if it is a clip of only musical instruments (and no vocals).
[x] Given genetic (DNA) data from a person, predict the odds of him/her developing diabetes over the next 10 years.
[ ] Given a large dataset of medical records from patients suffering from heart disease, try to learn whether there might be different clusters of such patients for which we might tailor separate treatments.
5.
Which of these is a reasonable definition of machine learning?
( ) Machine learning is the field of allowing robots to act intelligently.
(x) Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed.
( ) Machine learning is the science of programming computers.
( ) Machine learning learns from labeled data.