function x = emailFeatures(word_indices)
%EMAILFEATURES takes in a word_indices vector and produces a feature vector
%from the word indices
%   x = EMAILFEATURES(word_indices) takes in a word_indices vector and
%   produces a feature vector from the word indices.

% Total number of words in the dictionary
n = 1899;

% You need to return the following variables correctly.
x = zeros(n, 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Fill in this function to return a feature vector for the
%               given email (word_indices). To help make it easier to
%               process the emails, we have already pre-processed each
%               email and converted each word in the email into an index in
%               a fixed dictionary (of 1899 words). The variable
%               word_indices contains the list of indices of the words
%               that occur in one email.
%
%               Concretely, if an email has the text:
%
%                  The quick brown fox jumped over the lazy dog.
%
%               Then, the word_indices vector for this text might look
%               like:
%
%                   60  100   33   44   10     53  60  58   5
%
%               where we have mapped each word onto a number, for example:
%
%                   the   -- 60
%                   quick -- 100
%                   ...
%
%              (note: the above numbers are just an example and are not the
%               actual mappings).
%
%               Your task is to take one such word_indices vector and construct
%               a binary feature vector that indicates whether a particular
%               word occurs in the email. That is, x(i) = 1 when word i
%               is present in the email. Concretely, if the word 'the' (say,
%               index 60) appears in the email, then x(60) = 1. The feature
%               vector should look like:
%
%               x = [ 0 0 0 0 1 0 0 0 ... 0 0 0 0 1 ... 0 0 0 1 0 ...];
%
%
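
% A minimal sketch of one possible solution (not the official course
% answer). It assumes word_indices holds integer indices in 1..n;
% a repeated index simply sets the same entry to 1 again, and an empty
% word_indices leaves x as all zeros.
x(word_indices) = 1;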

% =========================================================================

end