   
  

  


                  .                   .           Python.      ,   matplotlib, numpy, pandas, sklearn     .     ,       , ,          ,   ,        .    .





 ,  

   



     



    



:

 . ., ..., , , , 

 . ., ..., , , , 

 . ., , ..., ,  , , 


       , ,   ,    ,     ,   .

  ,
  









                  ,  ,     . ,      ,      ,    ,            .

        . ,    ,        .   -        ,            ,         ,        .       ,       ,             .          ,      .     ,    ,     ,            .                 .            ,    ,  ,           :   -,   . ,    ,    , -  ,     ,            -     10 .

        ,         .      . . , . .   . . ,              .            ,  ,  .                .     ,   .

         ,          ,          .         ,         .         : https://www.dropbox.com/s/xtxicveo5lwmu8z/ML_book_ExamplesLabs_v.1.0.zip?dl=0 (https://www.dropbox.com/s/xtxicveo5lwmu8z/ML_book_ExamplesLabs_v.1.0.zip?dl=0).

 ,      , ,    ,   .

    ,      ,    .        ,        : mukhamediev.ravil@gmail.com (mailto:mukhamediev.ravil@gmail.com), amir_ed@mail.ru.        geoml.info.







  (Machine Learning  ML)   ,    ,   .        .       ,          .      ,   ,  ,  ,    . ML      ,   .            ,     ,  .      , ,    ,     .      ML           (Smart Services). ,   Gartner  2017  ( 1.1), ML      .  ,       ML.






 1.1.  , ,     [[1 - http://www.gartner.com/newsroom/id/3412017 (http://www.gartner.com/newsroom/id/3412017)]]



  ,    ,     : Smart Dust, Machine Learning, Virtual Personal Assistants, Cognitive Expert Advisors, Smart Data Discovery, Smart Workspace, Conversational User Interfaces, Smart Robots, Commercial UAVs (Drones), Autonomous Vehicles, Natural-Language Question Answering, Personal Analytics, Enterprise Taxonomy and Ontology Management, Data Broker PaaS (dbrPaaS)  Context Brokering.

 , ML         .  ML   ,   ,   , .    ML       .

         ,          . ,  RapidMiner [[2 - https://rapidminer.com/ (https://rapidminer.com/)]],     ,   ,         - . Matlab,          MathWorks,                    . GNU Octave    Matlab           Matlab.          .  Octave   [[3 - Octave online. https://octave-online.net/ (https://octave-online.net/) (2017-04-01).]],      [[4 - Octave download. https://www.gnu.org/software/octave/download.html (https://www.gnu.org/software/octave/download.html) (2017-04-01).]]. ,  Octave    ,       https://octave.sourceforge.io/packages.php (https://octave.sourceforge.io/packages.php).

      Python   ,       . ,            Anaconda (https://www.anaconda.com/ (https://www.anaconda.com/)),     Python.  numpy, matplotlib, pandas, sklearn,   Anaconda,               .

      .

  ,    ,     ,       ML,       ..         .           .      .

           ().          ML    .

         .    , ,     ,          ML.

            .

        .

         .

       .

             ML.

         ,         .           .                 .                      .                 .

     .     ,   .         .   ,        ,         ,        ,         .




 I.       





1.    .    



  ()    - ,      .    ,    (Natural Language Processing  NLP),    ,  , ,     [[5 - The Artificial Intelligence (AI) White Paper. https://www.iata.org/contentassets/b90753e0f52e48a58b28c51df023c6fb/ai-white-paper.pdf (https://www.iata.org/contentassets/b90753e0f52e48a58b28c51df023c6fb/ai-white-paper.pdf) (2021-02-23).]].       1.1.






 1.1.   



   ,    ,   ,  ,    .  ,   ML,    , ,     .



.  黠 ,  .         ,     .


       .  ,   [[6 - Nguyen G. et al. Machine Learning and Deep Learning frameworks and libraries for large-scale data mining: A survey // Artificial Intelligence Review. 2019. . 52.  1. . 77124.]]:

  (ML)      ,         (     )        .  ML     (SVM),  ,  ,  k-,   , ,     .

  (NN)   NN    ML,        .       ,   ,  .

  (Deep Learning -DL)    NN,      NN.   DL    ,    (CNN),    (RNN),    (GAN),   .

      1.2.






 1.2.     



          [[7 - Joseph A. Cruz and David S. Wishart. Applications of Machine Learning in Cancer Prediction and Prognosis // Cancer Informatics. 2006. Vol. 2. P. 5977.], [8 - Miotto R. et al. Deep learning for healthcare: Review, opportunities and challenges // Briefings in Bioinformatics. 2017. . 19.  6. . 12361246.]],  [[9 - Ballester, Pedro J. and John BO Mitchell. A machine learning approach to predicting proteinligand binding affinity with applications to molecular docking // Bioinformatics. 2010. Vol. 26.  9. P. 11691175.]], ,   [[10 - Mahdavinejad, Mohammad Saeid, Mohammadreza Rezvan, Mohammadamin Barekatain, Peyman Adibi, Payam Barnaghi, and Amit P. Sheth. Machine learning for Internet of Things data analysis: A survey // Digital Communications and Networks. 2018. Vol. 4. Issue 3. P. 161175.]]   [[11 - Farrar, Charles R. and Keith Worden. Structural health monitoring: A machine learning perspective. John Wiley & Sons, 2012. 66 p.], [12 - Lai J. et al. Prediction of soil deformation in tunnelling using artificial neural networks // Computational Intelligence and Neuroscience. 2016. . 2016. . 33.]],   [[13 - Liakos, Konstantinos et al. Machine learning in agriculture: A review // Sensors. 2018. 18(8). P. 2674.]],   [[14 - Friedrich Recknagel. Application of Machine Learning to Ecological Modelling // Ecological Modelling. 2001. Vol. 146. P. 303310.]]    [[15 -  . .,  . .,  . .           //    . 2018.  3. . 1425.]],       [[16 - Clancy, Charles, Joe Hecker, Erich Stuntebeck, and Tim O?Shea. Applications of machine learning to cognitive radio networks // Wireless Communications, IEEE. 2007. Vol. 14. Issue 4. P. 4752.]],   [[17 - Ball, Nicholas M. and Robert J. Brunner. Data mining and machine learning in astronomy // Journal of Modern Physics D. 2010. Vol. 19.  7. P. 10491106.]],   [[18 - R.Muhamediyev, E. Amirgaliev, S. Iskakov, Y. Kuchin, E. Muhamedyeva. Integration of Results of Recognition Algorithms at the Uranium Deposits // Journal of ACIII. 2014. Vol. 
18.  3. P. 347352.], [19 -  . .,  . .,  . .,  . .           //   . 2013.  3. . 8288.]],  [[20 - Chen Y., Wu W. Application of one-class support vector machine to quickly identify multivariate anomalies from geochemical exploration data // Geochemistry: Exploration, Environment, Analysis. 2017. . 17.  3. . 231238.]],    [[21 - Hirschberg J., Manning C. D. Advances in natural language processing // Science. 2015. . 349.  6245. . 261266.], [22 - Goldberg Y. A primer on neural network models for natural language processing // Journal of Artificial Intelligence Research. 2016. . 57. . 345420.]]  ..




1.1.      


          ,   .

        ?   ,           .

  ,    ,   ,     ,     .

      ,    ,    ,  .   ,          .

 ,     ,      , , ,   ,                   ; ,           ,  ,         ().

 ,         ,   ,   ML [[23 -             ,    ,       .]].

    L     [[24 - Taiwo Oladipupo Ayodele. Types of Machine Learning Algorithms // New Advances in Machine Learning. 2010. P. 1948.], [25 - Hamza Awad Hamza Ibrahim et al. Taxonomy of Machine Learning Algorithms to classify realtime Interactive applications // International Journal of Computer Networks and Wireless Communications. 2012. Vol. 2.  1. P. 6973.], [26 - Muhamedyev R. Machine learning methods: An overview // CMNT. 2015. 19(6). P. 1429.], [27 - Goodfellow I. et al. Deep learning. Cambridge: MIT press, 2016. . 1.  2.], [28 - Nassif A. B. et al. Speech recognition using deep neural networks: A systematic review // IEEE Access. 2019. . 7. . 1914319165.]]:    (Unsupervised Learning  UL) [[29 - Hastie T., Tibshirani R., Friedman J. Unsupervised learning. New York: Springer, 2009. P. 485585.]]   ,    (Supervised Learning  SL) [[30 - Kotsiantis, Sotiris B., I. Zaharakis, and P. Pintelas. Supervised machine learning: A review of classification techniques // Emerging Artificial Intelligence Applications in Computer Engineering. IOS Press, 2007. P. 324.]],  ,   (Semi-supervised Learning  SSL),    (Reinforcement Learning  RL)    (Deep Learning).      , ,      ( 1.3).

         UL,            ,      [[31 - Jain A. K., Murty M. N., Flynn P. J. Data clustering: A review // ACM computing surveys (CSUR). 1999. . 31.  3. . 264323.], [32 - Wesam Ashour Barbakh, Ying Wu, Colin Fyfe. Review of Clustering Algorithms. Non-Standard Parameter Adaptation for Exploratory Data Analysis // Studies in Computational Intelligence. 2009. Vol. 249. P. 728.]].        ,   .            .






 1.3.      [[33 - Mukhamediev R. I. et al. From Classical Machine Learning to Deep Neural Networks: A Simplified Scientometric Review //Applied Sciences. 2021. . 11. . 12. . 5541.]]



 SL     .    ,             .     .  ,      ,           ,     .

 SL           (  ),   .         ,       ( 1.4).






a)






b)

 1.4.  ()   (b) 



  1.1         ,     .



 1.1.      






           2.




1.2.       


        :  ,       ,       ,  ,         .      ,        .

 :

     numpy

 ,       pandas, pytables

   scipy, scikit-learn, opencv

 - matplotlib, bokeh, seaborn

  sympy, cython

       (Deep Learning frameworks):

Caffe/Caffe2, CNTK, DL4J, Keras, Lasagne, mxnet, PaddlePaddle, TensorFlow, Theano, Torch, Trax

 1.2       .



 1.2.  ,      







1.3.     


     ,      ,     ,     .  ,  ,   ,   ,     .           .    ,  ,     ,   ,  ,       ,          .   ,       ,      ,         .       1.5  ,   (  ), ,  ,   (, ,   ..),       (  , , )      .






 1.5.     (,   )  





.                  [[34 -  . .       . , 2016. 200 . ISBN 978-9934-14-876-7.]].


 ,       ,                ʻ.

,  ,    ,      , ,      (, ) .

 , ,   ,      ,       . ,        .

               ,        ,        ,  .

            1.6.

         ,        .   ,  ,    ,    ,  .            (train),  (test)      (validation) .        ,           .        ,           ,     .
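The split into train, validation and test subsets mentioned above can be sketched as follows. This is a minimal illustration and not the book's own listing: the toy arrays and the 60/20/20 proportions are assumptions chosen only to make the example concrete.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (hypothetical): 100 objects with 2 features and a binary label
X = np.arange(200).reshape(100, 2)
y = np.arange(100) % 2

# First hold out 20% of the data as the test set
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Then split the remainder 75/25, i.e. 60% train and 20% validation overall
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))
```

The validation part is typically used for choosing model settings, while the test part is kept aside for the final quality estimate.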






 1.6.         






1.4.  


1.      ,    ?

2.           ?

3.        ?

4.          ?

5.        _____?

6.    ,       (Supervised Learning).

7.        ?

8.    ,       (Unsupervised Learning).

9.      .  ,          ?




2.   





2.1.     


     (        )    [[35 -  . .  ,   ,  ,  WEKA, RapidMiner  MatLab (      ):  . .: .     . . . , 2010.]].

   : Ob (  ), Y (   )  () .

  y: Ob ? Y,        (  () (sample set))  m:








     ob


, ob


,, ob


.    A (),    ob   y(ob)    ,    .

 ,       ,         .

   Y = {1, 2,, l}     ( l  ).     ,   X    C


,, C


,  Ci = {ob Ob | y(ob) = i}  i{1, 2,, l}:



$Ob = \bigcup_{i=1}^{l} C_i$.


 Y = {(?1,,?l ) |?1,,?l {0,1      l  .  i-   Ci = {ob Ob | y(ob) = (?1,,?l), ?i = 1}.

  ,      A,       (cost function) J(A(ob), y(ob)),  ,    A(ob)      y(ob).     , 








   



$J(A(ob), y(ob)) = |A(ob) - y(ob)|$

$J(A(ob), y(ob)) = (A(ob) - y(ob))^2$.
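The two cost functions above can be evaluated directly with numpy. The sketch below is illustrative only; the function names and the sample values are assumptions, not part of the original text.

```python
import numpy as np

def absolute_cost(prediction, target):
    """Absolute error |A(ob) - y(ob)| for each object."""
    prediction = np.asarray(prediction, dtype=float)
    target = np.asarray(target, dtype=float)
    return np.abs(prediction - target)

def squared_cost(prediction, target):
    """Squared error (A(ob) - y(ob))^2 for each object."""
    prediction = np.asarray(prediction, dtype=float)
    target = np.asarray(target, dtype=float)
    return (prediction - target) ** 2

# Hypothetical algorithm answers A(ob) and true answers y(ob)
predictions = np.array([2.0, 0.5, 3.0])
targets = np.array([1.0, 0.5, 5.0])

abs_costs = absolute_cost(predictions, targets)
sq_costs = squared_cost(predictions, targets)
print(abs_costs, sq_costs)
```

The squared form penalizes large deviations more strongly than the absolute form, which is one reason it is common in regression.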


  :    ?           ().        ,         ,       .    ,        , ,        , ,   ,      ,      .

 ,   ob    ()    (input values or features) x


,x


,.x


,    ob


? Ob ,  y    ( ) (target value)        .

       ?


? ? ,       ,  (weights) w


? W.

       ,       ?   ,      J(?)    m.

    A    ,           ( )         m.       h


(x),      ?


? ?    J(?).








 m      ; x


       i- ; y


         i- ; h


  ,     (h


 = ?


 + ?


x)   (,       (h


 = ?


 + ?


x + ?


x


).

,       ,     ,          (x),      (y) ( 2.1).






 2.1.      



        .      ,         ,   ,   ..        ,       .

 ,  ,     ?


,   x


?X (  ),  ,         .              0?x?1  1?x?1.  ,       .  ,         ,    ,   (, )    ..        .          ,    ,     (  ,     ),     h


(x),    J(?).




2.2.    


         (.  2.1)  ,      h


= ?


 + ?


x. ,         ( 3.1a).     h


(x)     (gradient descent),        ?


, ?


  :








 ?   ;  


      ?


.  :=  ,      (=),    .

     ,           2.2           .  ,          ,   ,  :








,        :








   :








  ,  x


= 1.           X,       ,   ?.

    1.3  1.4    :








     ?     ()      ?  .

   ,           (Batch Gradient Descent)        .        ?


   :








 ?   ; (X


X)


    X


X; X


   X.

    ,      ?     .       ,     O(n


),   c       .
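A minimal sketch of the closed-form (normal equation) solution $\theta = (X^T X)^{-1} X^T y$ is given here, using the same data as the example in this section (30 points of y = x^2 + 1 on [0, 10]). Using `np.linalg.solve` instead of an explicit matrix inverse is a choice made for this illustration.

```python
import numpy as np

# Same data as in the section's example: y = x^2 + 1 on 30 points
x = np.linspace(0, 10, 30)
y = x ** 2 + 1

# Design matrix: a column of ones (intercept) and the feature x
X = np.column_stack([np.ones_like(x), x])

# Solve (X^T X) theta = X^T y rather than forming the inverse explicitly
theta = np.linalg.solve(X.T @ X, X.T @ y)

h = X @ theta
mse = np.mean((h - y) ** 2)
print('theta =', theta, 'mse =', mse)
```

For a handful of features this is fast and needs no learning rate. The cubic cost in the number of features is what makes iterative methods attractive for wide data.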



 .

         . -,   :



%matplotlib inline

import matplotlib.pyplot as plt

import numpy as np

import time



,   time      .       .   ,   30 :



xr=np.matrix(np.linspace(0,10,30))
x=xr.T
# target values: y = x^2 + 1
y=np.power(x,2)+1
# plot the data points
plt.figure(figsize=(9,9))
plt.plot(x,y,'.')






Figure 2.2. The function $y = x^2 + 1$



        (m = 30),       .  o      ,  ,   size:



m=x.size
# column of ones for the intercept term
on=np.ones([m,1])
# design matrix X: the ones column followed by x
X=np.concatenate((on,x),axis=1)



 ,      ,      x


, x


,, x


.        :



theta=np.matrix('0.1;1.3')
# hypothesis values for the initial parameters
h=np.dot(X,theta)
# plot the initial hypothesis line
plt.plot(x,h)



  :






 2.3.    



  ,       .            ( ):



t0=time.time()
alpha=0.05
iterations=500
for i in range(iterations):
    # note: one scalar step (the gradient with respect to the slope)
    # is subtracted from both components of theta
    theta=theta-alpha*(1/m)*np.sum(np.multiply((h-y),x))
    h=np.dot(X,theta)
t1=time.time()
# plot the data and the fitted line
plt.figure(figsize=(9,9))
plt.plot(x,y,'.')
plt.plot(x,h,label='regressionByIteration')
leg=plt.legend(loc='upper right',shadow=True,fontsize='x-small')
leg.get_frame().set_facecolor('#eeeeee')
leg.get_frame().set_alpha(0.5)
plt.show()
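The loop above subtracts the same scalar from both components of theta. For comparison, here is a hedged sketch of batch gradient descent in which each parameter gets its own partial derivative, computed as `X.T @ (h - y) / m`. The step size `alpha = 0.01` and the iteration count are assumptions chosen so that the iteration converges on this data; they are not the values used in the book.

```python
import numpy as np

# Same data as in the example: 30 points of y = x^2 + 1 on [0, 10]
x = np.linspace(0, 10, 30).reshape(-1, 1)
y = x ** 2 + 1
m = x.shape[0]

X = np.hstack([np.ones((m, 1)), x])   # design matrix with intercept column
theta = np.array([[0.1], [1.3]])      # same starting point as in the text

alpha = 0.01        # assumed learning rate
iterations = 5000   # assumed number of steps

for _ in range(iterations):
    h = X @ theta                      # current predictions
    grad = (X.T @ (h - y)) / m         # one partial derivative per parameter
    theta = theta - alpha * grad       # simultaneous update of all parameters

h = X @ theta
mse = float(np.mean((h - y) ** 2))
print('theta =', theta.ravel(), 'mse =', mse)
```

With per-parameter gradients the result approaches the least-squares line, so the error it reports may differ from the figure printed for the original loop.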



    :






 2.4.     



# mean squared error
mse=np.sum(np.power((h-y),2))/m
print('regressionByIteration mse= ', mse)
# elapsed training time in seconds
print('regressionByIterations takes ',(t1-t0))



   :

regressionByIterations mse = 63.270782365456206

regressionByIterations takes 0.027503490447998047

            sklearn.

from sklearn.metrics import mean_squared_error, r2_score
# convert np.matrix to plain arrays, which recent scikit-learn versions require
y_predict = np.asarray(h)
y_test = np.asarray(y)
print("Mean squared error: {:.2f}".format(mean_squared_error(y_test,y_predict)))
print("r2_score: {:.2f}".format(r2_score(y_test, y_predict)))



  :

Mean squared error: 63.27

r2_score: 0.93
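To make the two metrics concrete, here is a small sketch that computes them by hand with numpy. The sample arrays and function names are hypothetical. The R² formula, one minus the ratio of the residual to the total sum of squares, is the standard definition.

```python
import numpy as np

def mse_manual(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    return np.mean((y_true - y_pred) ** 2)

def r2_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical targets and predictions
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.5, 2.0, 2.5, 4.0])

mse_value = mse_manual(y_true, y_pred)
r2_value = r2_manual(y_true, y_pred)
print(mse_value, r2_value)
```

An R² close to 1 means the model explains most of the variance of the target. A value near 0 means it does no better than predicting the mean.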



       .  3.    ML.




2.3. Polynomial regression


    ,        


,      ()  .          .  ,   ,        ,      ,       .    ,        .   ,      ,  ,     :








  ?      .       ?       .   ,       ,   .        ,                .

        ,   ,        ,    j-       :








 .

          :








      np.array([np.random.rand(x.size)]).T/50,    ( 2.5):
















 2.5.        



  degree,   . ,  degree = 1     (r2_score = 0.27).   ,     . ,  degree = 19 r2_score = 0.90.    lambda_reg    .  ,     ,  :

xr=np.array([np.linspace(0,1,180)])
x=xr.T
print(x.size)
y=f(x)
(x,y)=plusRandomValues(x,y) # add random noise
plt.figure(figsize=(9,9))
plt.plot(x,y,'.')
m=x.size
degree=19 # polynomial degree
lambda_reg=0.00001
on=np.ones([m,1])
X=on
# build the design matrix with powers of x up to degree
for i in range(degree):
    xx=np.power(x, i+1)
    X=np.concatenate((X,xx),axis=1)
theta=np.array([np.random.rand(degree+1)])
h=np.dot(X,theta.T)
t0=time.time()
alpha=0.5
iterations=100000
for i in range(iterations):
    # gradient step plus a shrinkage term controlled by lambda_reg
    theta=theta-alpha*(1/m)*np.dot((h-y).T,X) -(lambda_reg/m)*theta
    h=np.dot(X,theta.T)
t1=time.time()
plt.plot(x,y,'.')
plt.plot(x,h, label='Regression degree = {:d}'.format(degree))
leg=plt.legend(loc='upper left',shadow=True,fontsize=16)
leg.get_frame().set_facecolor('#eeeeee')
leg.get_frame().set_alpha(0.9)
plt.show()
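As a compact alternative to the iterative loop, a regularized polynomial fit can also be sketched in closed form. Everything specific here is an assumption for illustration: the target (a noisy sine) stands in for the book's `f(x)`, which is not shown in the text, the degree is lowered to 5, and the intercept is left unpenalized.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a sine wave plus small uniform noise
x = np.linspace(0, 1, 180)
y = np.sin(2 * np.pi * x) + rng.random(x.size) / 50

degree = 5          # assumed polynomial degree
lambda_reg = 1e-5   # regularization strength

# Columns 1, x, x^2, ..., x^degree
X = np.vander(x, degree + 1, increasing=True)

# Regularized normal equation: (X^T X + lambda * I) theta = X^T y
penalty = lambda_reg * np.eye(degree + 1)
penalty[0, 0] = 0.0   # do not shrink the intercept (a design choice)
theta = np.linalg.solve(X.T @ X + penalty, X.T @ y)

h = X @ theta
r2 = 1 - np.sum((y - h) ** 2) / np.sum((y - y.mean()) ** 2)
print('r2 = {:.3f}'.format(r2))
```

Increasing `lambda_reg` shrinks the coefficients and smooths the curve. Setting it to zero gives the ordinary least-squares polynomial.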




2.4. Classification. Logistic regression


        ,        .    ,     .          ,         ,        ,  y ? {0,1}.

         0 ?h


(x) ?1,     () :








 ?   .

  








 n    (  ) ; g(z)     .

$h_\theta(x) = g(\theta^T x)$.

,              ,             .        .

 h


(x)     ,     (h


(x)?0.5)   (h


(x)<0.5).   ,    , ,    ( 2.6),    , ,    :








    ..






 2.6. ,      



  ?      ,     :








    ,   +,    ,      y     : 1 0.

   ,  y = 0,   i-   :








 ,                         . -       (maximum-entropy classification  MaxEnt).
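The cost function of logistic regression discussed above is commonly written as the average log-loss (binary cross-entropy). The sketch below assumes that standard form; the function name, the clipping constant `eps` and the sample values are illustrative assumptions.

```python
import numpy as np

def log_loss_cost(h, y, eps=1e-12):
    """Average binary cross-entropy between predictions h and labels y."""
    h = np.clip(h, eps, 1 - eps)   # avoid log(0)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

y_demo = np.array([1, 0, 1, 0])
confident_right = np.array([0.9, 0.1, 0.8, 0.2])
confident_wrong = np.array([0.1, 0.9, 0.2, 0.8])
undecided = np.full(4, 0.5)

cost_right = log_loss_cost(confident_right, y_demo)
cost_wrong = log_loss_cost(confident_wrong, y_demo)
cost_undecided = log_loss_cost(undecided, y_demo)
print(cost_right, cost_undecided, cost_wrong)
```

The cost grows without bound as a confident prediction turns out to be wrong, which is what pushes the parameters away from such predictions during training.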

      ,          (gradient descent),    Conjugate gradient [[36 - Martin Fodslette M?ller. A scaled conjugate gradient algorithm for fast supervised learning // Neural Networks. 1993. Vol. 6. Issue 4. P. 525533.]], BFGS, L-BFGS  lbfgs [[37 - Dong C. Liu, Jorge Nocedal. On the limited memory BFGS method for large scale optimization // Mathematical Programming. 1989. Vol. 45. Issue 13. P. 503528.]].

         .         . ,     ,             m


axh





(x),  i   .  ,     ,    .

      ,        (    ),       :








,         ,        ( ., ,  [[38 - Derivative of Cost Function for Logistic Regression. https://medium.com/mathematics-behind-optimization-of-cost-function/derivative-of-log-loss-function-for-logistic-regression-9b832f025c2d (https://medium.com/mathematics-behind-optimization-of-cost-function/derivative-of-log-loss-function-for-logistic-regression-9b832f025c2d)]]). ,       ,      ( 1.5),   ,         2.8.

.       .            :



import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn.datasets import make_moons, make_circles, make_classification
from sklearn.model_selection import train_test_split
dataset = make_circles(noise=0.2, factor=0.5, random_state=1)
X_D2, y_D2 = dataset
plt.figure(figsize=(9,9))
plt.scatter(X_D2[:,0],X_D2[:,1],c=y_D2,marker='o',
            s=50,cmap=ListedColormap(['#FF0000','#00FF00']))
X_train, X_test, y_train, y_test = train_test_split(X_D2, y_D2, test_size=.4, random_state=42)



    ,    1.3.

    :



import matplotlib.pyplot as plt

import numpy as np

from sklearn.metrics import confusion_matrix, classification_report

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score



         (.     ML).

   logisticFunction(X,theta)  ,        , logRegPredictMatrix(h,threshold).   ,      0  1.          (1  , 0  ),         ()   predicted = 0 If h <threshold  predicted = 1 If h >= threshold.    threshold=0.5.
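The bodies of `logisticFunction` and `logRegPredictMatrix` are not shown in the text. The sketch below is one plausible implementation consistent with the description (a sigmoid of the linear combination, and the rule predicted = 1 if h >= threshold, else 0). It is an assumption, not the author's code, and the demo arrays are hypothetical.

```python
import numpy as np

def logisticFunction(X, theta):
    """Sigmoid of the linear combination X @ theta; values lie in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-np.dot(X, theta)))

def logRegPredictMatrix(h, threshold=0.5):
    """Turn probabilities into class labels: 1 if h >= threshold, else 0."""
    return (np.asarray(h) >= threshold).astype(int)

# Hypothetical check: the first column of X_demo is the bias term
X_demo = np.array([[1.0, -2.0],
                   [1.0,  0.0],
                   [1.0,  3.0]])
theta_demo = np.array([0.0, 1.0])

h_demo = logisticFunction(X_demo, theta_demo)
pred_demo = logRegPredictMatrix(h_demo, threshold=0.5)
print(h_demo, pred_demo)
```

Raising the threshold makes the classifier more conservative about predicting class 1, which trades recall for precision.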

,       :



def logisticRegressionByNumpy(X,y):
    m=y.size
    # add the column of ones for the intercept term
    X=np.concatenate((np.ones([m,1]),X), axis=1)
    theta=np.array(np.random.rand(X.shape[1]))
    h=logisticFunction(X,theta)
    alpha=0.05
    iterations=1500
    lambda_reg=0.01
    for i in range(iterations):
        theta=theta - alpha*(1/m)*np.dot(X.T,(h-y))-(lambda_reg/m)*theta
        h=logisticFunction(X,theta)
    return theta,h



       :



theta,h=logisticRegressionByNumpy(X_train,y_train)
predicted_train=logRegPredictMatrix(h,threshold=0.50)
matrix_train = confusion_matrix(y_train, predicted_train)
print('Logistic regression')
print('Results on train set')
print('Accuracy on train set: {:.2f}'.format(accuracy_score(y_train, predicted_train)))
print('Conf. matrix on the train \n', matrix_train)
print('Classification report at train set\n',
      classification_report(y_train, predicted_train, target_names = ['not 1', '1']))



       accuracy = 0.57,    0.4.  ,     ,     !    ,      ,     .

  ,         (2.10).       :

X=np.concatenate((np.ones([m,1]),X,X**2), axis=1)

     accuracy     ,  0.9,      .
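The effect of adding squared features can be reproduced with a short, self-contained sketch. It uses scikit-learn's `LogisticRegression` instead of the numpy implementation from the text (a substitution made here to keep the example short), and it reports accuracy on the same data it was fitted on, so the numbers are only indicative.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression

# Same synthetic dataset as in the text: two noisy concentric circles
X, y = make_circles(noise=0.2, factor=0.5, random_state=1)

# Linear decision boundary on the raw features
linear_model = LogisticRegression().fit(X, y)
acc_linear = linear_model.score(X, y)

# Add squared features so the boundary can become a circle or ellipse
X_quad = np.concatenate((X, X ** 2), axis=1)
quad_model = LogisticRegression().fit(X_quad, y)
acc_quad = quad_model.score(X_quad, y)

print('linear: {:.2f}, with squares: {:.2f}'.format(acc_linear, acc_quad))
```

The model itself is unchanged. Only the feature space is richer, which is why a linear classifier can separate classes that are not linearly separable in the original coordinates.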

             .

     ,        .

 ,        ,   ,          .

       .        ,         .



.    MLF_logReg_Python_numpy_002.ipynb,    ,    


https://www.dropbox.com/s/vlp91rtezr5cj5z/MLF_logReg_Python_numpy_002.ipynb?dl=0




2.5.  


      ?

         ?

         ?

        .

     ?    .

     .     ,  y = 0, h = 0, m = 2?

  ?

   ?

         ?

     ?

    ?




2.6. Artificial neural networks





2.6.1.  


   (Artificial Neural Networks  ANN  )  ,      40-   .            ,    ( 70- )       .                Warren S. McCulloch (http://link.springer.com/search?facet-creator=%22Warren+S.+McCulloch%22), Walter Pitts (http://link.springer.com/search?facet-creator=%22Walter+Pitts%22) [[39 - Warren S. McCulloch, Walter Pitts. A logical calculus of the ideas immanent in nervous activity // The bulletin of mathematical biophysics. 1943. Vol. 5. Issue 4. P. 115133.]],    [[40 - Rosenblatt, F. The perceptron: A probabilistic model for information storage and organization in the brain // Psychological Review. 1958. Vol. 65 (6). P. 386408.]]   .       .   .  [[41 - Minsky M. L., Papert S. A. Perceptrons: An Introduction to Computational Geometry. MIT, 1969. 252 p.], [42 - Marvin Minsky, Seymour Papert. Perceptrons, expanded edition. The MIT Press, 1987. 308 p.]].           ,         ,  ,      XOR    .          .    60-    ,     .

 1974    ,        (backpropagation) [[43 - Werbos P. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. Harvard University, 1974. 38 p.], [44 - Werbos P. J. Backpropagation: past and future // IEEE International Conference on Neural Networks. San Diego, 1988. Vol. 1. P. 343353.]],     ,      ()        .          .



.       . ,      . .       1974 .       ,    , , ,    backpropagation  1986 .      .


          [[45 - : .   . .: -  . . . , 2004. 320 .], [46 -  . .       // : , . .: , 2006.  2. . 4971.], [47 -  . .    :    . .: , 2008. 176 .], [48 -  . .  :  .    , 2010. 496 .]],         ,       .



.          ,    ,        .            [[49 - Connectionism. Internet Encyclopedia of Philosophy.https://iep.utm.edu/connect/#:~:text=Connectionism%20is%20an%20approach%20to,%2C%20neuron%2Dlike%20processing%20units (https://iep.utm.edu/connect/#:~:text=Connectionism%20is%20an%20approach%20to,%2C%20neuron%2Dlike%20processing%20units)]].


   ANN    ,     ()   ,       (Feed Forward Neural Networks) [[50 - David Saad. Introduction. On-Line Learning in Neural Networks. Cambridge University Press, 1998. P. 38.]].  1989    G. Gybenco [[51 - Cybenco G. Approximation by superpositions of a sigmoidal function // Mathematics of Control, Signals, and Systems. 1989. Vol. 4. P. 304314.]], K. Hornik [[52 - Hornik K. et al. Multilayer feedforward networks are universal approximators // Neural Networks. 1989. Vol. 2. P. 359366.]]  . ,         .           .      90- ,      ,      .                     [[53 - Schmidhuber, J?rgen. Deep learning in neural networks: An overview // Neural Networks. 2015. Vol. 61. P. 85117.]],       ,    , ,  ,    ,    .        27 [[54 - http://www.asimovinstitute.org/neural-network-zoo/ (http://www.asimovinstitute.org/neural-network-zoo/)  THE NEURAL NETWORK ZOO POSTED ON SEPTEMBER 14, 2016 BY FJODOR VAN VEEN]].          .

  ANN        ,   , ,  ,  , , .




2.6.2.     


 ANN    .         ,    , ,      ( 2.7).






 2.7.   



   :






 g(z)   .



           (Eq. 2.9).

       ,     .

      ?      w,           ,         (weight).  ,          (W)   .     ,      ,  ,   ,    ,      ..

       ,      ,     1.5.     ,         .

      ,      a





], a





], a





], a





], a





]      a





].      ,     ,    .      .     L-         a


] = x.

       








  :








   j,     i:








    bias        


,

 w





]     j.

  :








,     2.8         ,     :








    :











 2.8.       



   w   ( )   ,       (Eq. 2.12).








 L     ; s


     l; K    (     ); W   .

      c  .      ,   ,          ,          1 ( 1.5).            ,    h


(x


) > 0.5,      .  ,      ,    (  ),       2      ( 1.6),     ..






 2.9.      



 ,       ,      (Backpropagation of errors  BPE) [[55 - Werbos P. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. Harvard University, 1974. 38 p.]]   ,     .




2.6.3.    


  BPE   .    








    :








 1.        ,       ,        ,     1.4, 1.5.        a


],a


],,a


]     ().

 2.       a


= h











 (x),        y


,    :








 L      .

 3.  ,         :








  *    ; g' .



   :








    :








    :








    dz


], dz


],  , dz


   .

 4.           I ? L:








 i    ; ?    (learning rate) (0 < ? < 1); ?


    i; dz


    i-  ( ,  ).



   ,   14          .
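The four steps above (forward pass, output error, backward propagation of the error, weight update) can be sketched for a tiny network with numpy. The inputs x1 = 0, x2 = 1, the target y = 1 and the learning rate 0.5 follow the worked example in this section. The 2-2-1 layout, the initial weights, the separate bias vectors and the cross-entropy loss are assumptions made for this illustration, so the intermediate numbers will not match the book's.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([[0.0], [1.0]])   # inputs x1 = 0, x2 = 1
y = np.array([[1.0]])          # target output

# Hypothetical initial weights for a 2-2-1 network
W1 = np.array([[0.1, 0.2], [0.3, 0.4]])
b1 = np.zeros((2, 1))
W2 = np.array([[0.5, 0.6]])
b2 = np.zeros((1, 1))
lr = 0.5                       # learning rate

losses = []
for step in range(200):
    # Step 1: forward pass, layer by layer
    a1 = sigmoid(W1 @ x + b1)
    a2 = sigmoid(W2 @ a1 + b2)
    loss = -(y * np.log(a2) + (1 - y) * np.log(1 - a2)).item()
    losses.append(loss)

    # Step 2: error at the output layer
    dz2 = a2 - y
    # Step 3: propagate the error back through the sigmoid derivative
    dz1 = (W2.T @ dz2) * a1 * (1 - a1)
    # Step 4: update weights and biases
    W2 -= lr * dz2 @ a1.T
    b2 -= lr * dz2
    W1 -= lr * dz1 @ x.T
    b1 -= lr * dz1

print('first loss: {:.4f}, last loss: {:.4f}'.format(losses[0], losses[-1]))
```

Note that `dz1` is computed before `W2` is modified, so the backward pass uses the same weights as the forward pass.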

           ( 2.10):






 2.10.      



      ,      .



      



   ( 2.11)       w,  b.  ,   ,        ,       ,       .   -   .






 2.11.       



     x           a


].    x1 = 0, x2 = 1,  a





] = x1 = 0  a





] = x2 = 1.  (bias)    a





= 1.



  ,  ,   [1,0,1],       y=1.

 1.   .

       :








  :








 2.    .

    y


 = 1,    0.78139. , c      ,        .








 3.   .

      ,    .        ,  

















      , ,   ,    .

 4.    .

        (learning rate) ro = 0.5. ,     ro   0.1.      ,         .

  (Eq. 2.18)     :








  :








   ,           :








,     .

   dz





] = 0.14184



.     BPE   Python-numpy   MLF_Example_Of_BPE  https://www.dropbox.com/s/tw6zwht3d5pd4zf/MLF_Example_Of_BPE.html?dl=0 (https://www.dropbox.com/s/tw6zwht3d5pd4zf/MLF_Example_Of_BPE.html?dl=0)


,  ,        ,           .        L-      :

















 W


   i-   ; X      n x m (n   , m    ).

         :



















.  ,     ,        .         ,       .   ,           .  ,     ,    ,   ,     .           .


,  ,   ,                 .      . Batch Gradient Descent    ,      .              W


].

,     ,    ,       ,        .      [[56 - Batch, Mini-Batch & Stochastic Gradient Descent. https://towardsdatascience.com/batch-mini-batch-stochastic-gradient-descent-7a62ecba642a (https://towardsdatascience.com/batch-mini-batch-stochastic-gradient-descent-7a62ecba642a)]]:

Stochastic Batch Gradient Descent       ,      .

Mini Batch Gradient Descent       .
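The difference between these variants lies only in how many examples feed each parameter update. A minimal sketch of a mini-batch iterator is given here; the helper name and the toy arrays are assumptions.

```python
import numpy as np

def iterate_minibatches(X, y, batch_size, rng):
    """Yield shuffled (X_batch, y_batch) pairs covering the data once."""
    indices = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        batch = indices[start:start + batch_size]
        yield X[batch], y[batch]

rng = np.random.default_rng(0)
X_toy = np.arange(20).reshape(10, 2)   # 10 hypothetical examples
y_toy = np.arange(10)

batches = list(iterate_minibatches(X_toy, y_toy, batch_size=4, rng=rng))
batch_sizes = [len(yb) for _, yb in batches]
seen = sorted(np.concatenate([yb for _, yb in batches]).tolist())
print(batch_sizes)
```

With `batch_size=1` this reduces to stochastic gradient descent, and with `batch_size=len(X)` to batch gradient descent.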



.            8, 16, 32, , 1024    ,        - .


             BPE.          (solver).   lbfs, adam. ,    (multilayer perceptron  MLP)       :

from sklearn.neural_network import MLPClassifier

clf = MLPClassifier(hidden_layer_sizes = [10, 10], alpha = 5, random_state = 0, solver='lbfgs')



  MLPClassifier    2.8   .




2.6.5.  


          .         ,    .             ( 2.12),  


























 2.12.  ,    



 :       ? , :       ,    .  ,           .            .   ,     ReLU        . ,      .   ,               ,        0  1.
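For reference, three widely used activation functions can be written in a few lines of numpy. Which functions the figure in this section shows is not recoverable from the text, so the selection below (sigmoid, tanh, ReLU) is an assumption.

```python
import numpy as np

def sigmoid(z):
    """Squashes any real input into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    """Squashes any real input into the interval (-1, 1)."""
    return np.tanh(z)

def relu(z):
    """Passes positive inputs unchanged and zeroes out negative ones."""
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z), tanh(z), relu(z))
```

ReLU is cheap to compute and does not saturate for positive inputs, which is one reason it is popular in hidden layers. A sigmoid output remains convenient when the result should lie between 0 and 1.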




2.7.  


       ?

         ?

   .

     .

       ?

      .

      ?    ?

         ?

     ?

,         .

       ,    ?

     ,    ?

      ReLU?

       ?

          ?




2.8.   


    ,      TensorFlow [[57 -    :  . https://www.tensorflow.org/tutorials/keras/classification (https://www.tensorflow.org/tutorials/keras/classification)]]. TensorFlow         ,        MLPClassifier.     ,             (28  28).   Fashion-MNIST  60 000     10 000  ,          .  10  . ,   0  9,       2.13.






 2.13.  Fashion-MNIST



Fashion-MNIST        MNIST,     Hello, World         . MNIST     (0, 1, 2  ..)  ,      Fashion-MNIST.      MNIST   ,            .

   ,  keras.     :



# TensorFlow and tf.keras
import tensorflow as tf
from tensorflow import keras
# helper libraries
import numpy as np
import matplotlib.pyplot as plt



         :

fashion_mnist = keras.datasets.fashion_mnist

(X_train1, y_train),(X_test1,y_test)= fashion_mnist.load_data()

plt.figure()

plt.imshow(X_train1[10])

plt.colorbar()

plt.grid(False)

plt.show()








 ,       0  255.        ,     .      ,          0  1,      255:

X_train1=X_train1/255.0

X_test1=X_test1/255.0



,       ,      28 x 28   .         784:

X_train=np.reshape(X_train1,(X_train1.shape[0],X_train1.shape[1]*X_train1.shape[2]))

X_test=np.reshape(X_test1,(X_test1.shape[0],X_test1.shape[1]*X_test1.shape[2]))



   X_train  (60 000, 28, 28)      (60 000, 784),         .

from sklearn.neural_network import MLPClassifier
clf = MLPClassifier(hidden_layer_sizes = [15, 15],
                    alpha = 0.01, random_state = 0,
                    solver='adam').fit(X_train, y_train)



      .       :



from sklearn.metrics import classification_report, confusion_matrix
predictions=clf.predict(X_test)
print('Accuracy of NN classifier on training set: {:.2f}'
      .format(clf.score(X_train, y_train)))
print('Accuracy of NN classifier on test set: {:.2f}'
      .format(clf.score(X_test, y_test)))
print(classification_report(y_test,predictions))
matrix = confusion_matrix(y_test, predictions)
print('Confusion matrix on test set\n',matrix)



 accuracy    :

Accuracy of NN classifier on training set: 0.89

Accuracy of NN classifier on test set: 0.86



         alpha (, hidden_layer_sizes = [75, 75], alpha = 0.015),    :

Accuracy of NN classifier on training set: 0.91

Accuracy of NN classifier on test set: 0.88



.    MLF_MLP_Fashion_MNIST_001.ipynb      https://www.dropbox.com/s/ryk05tyxwlhz0m6/MLF_MLP_Fashion_MNIST_001.html?dl=0 (https://www.dropbox.com/s/ryk05tyxwlhz0m6/MLF_MLP_Fashion_MNIST_001.html?dl=0)




2.9. The k-nearest neighbors method (k-Nearest Neighbor, k-NN)


 [[58 - Dudani, Sahibsingh A. The Distance-Weighted k-Nearest-Neighbor Rule // Systems, Man, and Cybernetics. 1976. Vol. SMC-6. Issue 4. P. 325327.], [59 - K-Nearest Neighbors algorithm. http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm (http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm) (2012-07-05).]]          ()     () .      ,        .    ,       .

   ,         .  ,       10      2  15 /     1,        .

            .  ,   ,         .            ( ),      :  (Manhattan),   (Chebyshev),  (Minkowski)  .
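The distance measures named above differ only in how coordinate differences are combined. A brief sketch follows; the function names and sample points are illustrative.

```python
import numpy as np

def minkowski(a, b, p):
    """Minkowski distance of order p between points a and b."""
    return np.sum(np.abs(a - b) ** p) ** (1.0 / p)

def manhattan(a, b):
    """Sum of absolute coordinate differences (Minkowski with p = 1)."""
    return np.sum(np.abs(a - b))

def euclidean(a, b):
    """Straight-line distance (Minkowski with p = 2)."""
    return np.sqrt(np.sum((a - b) ** 2))

def chebyshev(a, b):
    """Largest absolute coordinate difference."""
    return np.max(np.abs(a - b))

a = np.array([0.0, 0.0])
b = np.array([3.0, 4.0])
print(manhattan(a, b), euclidean(a, b), chebyshev(a, b))
```

Because every metric here depends on raw coordinate differences, features measured on very different scales should usually be normalized before applying k-NN.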

      :








 w(i, u)   i-    u; a(u;Xl)    u,    Xl.

     ,   ,            ,        .             (    )  .  ,  ,           .

      ,     .      .            .
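A from-scratch sketch of distance-weighted k-NN classification is given here to make the voting rule concrete. The inverse-distance weight, the Euclidean metric and the toy data are assumptions; the weighting function w(i, u) used in the text may differ.

```python
import numpy as np

def knn_predict(X_train, y_train, u, k=3, eps=1e-12):
    """Classify point u by a weighted vote of its k nearest neighbors."""
    distances = np.sqrt(np.sum((X_train - u) ** 2, axis=1))
    nearest = np.argsort(distances)[:k]
    votes = {}
    for i in nearest:
        weight = 1.0 / (distances[i] + eps)   # closer neighbors count more
        label = int(y_train[i])
        votes[label] = votes.get(label, 0.0) + weight
    return max(votes, key=votes.get)

# Two hypothetical, well-separated clusters
X_toy = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                  [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

label_a = knn_predict(X_toy, y_toy, np.array([0.5, 0.5]))
label_b = knn_predict(X_toy, y_toy, np.array([5.5, 5.5]))
print(label_a, label_b)
```

There is no training phase in this sketch: all the work happens at prediction time, which is why k-NN becomes slow on large training sets.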

 .

       :

from sklearn import neighbors

clf = neighbors.KNeighborsClassifier(n_neighbors=5, weights='distance')



      Fashion-MNIST.     ,       KNeighborsClassifier  ,  MLP,     : 10 000     2000  :

X_train1=X_train1[0:10000,:,:]

y_train=y_train[0:10000]

X_test1=X_test1[0:2000,:,:]

y_test=y_test[0:2000]



  :



from sklearn.neighbors import KNeighborsClassifier

clf = KNeighborsClassifier(n_neighbors = 5, weights='distance')

clf.fit(X_train, y_train)



     ,     .   ,  :



Accuracy of kNN classifier on training set: 1.00

Accuracy of kNN classifier on test set: 0.82



.  MLF_KNN_Fashion_MNIST_001.ipynb,    ,      https://www.dropbox.com/s/ei1tuaifi2zj2ml/MLF_KNN_Fashion_MNIST_001.html?dl=0 (https://www.dropbox.com/s/ei1tuaifi2zj2ml/MLF_KNN_Fashion_MNIST_001.html?dl=0)





2.10. Support vector machines


   (Support Vector Machines) [[60 - Support vector machine. http://en.wikipedia.org/wiki/Support_vector_machine (http://en.wikipedia.org/wiki/Support_vector_machine) (2012-02-22).]]     :       .       .        .           ,     ,    .  R3, ,     .

    ,     (  ), ..  ,              .       f(x)  :








 ?w,s?   ; w   ()    ; b   ,          .   b  ,     .

,   f(x) = 1,    ,    f(x) = -1   .
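The decision rule of a linear classifier of this kind can be sketched as the sign of a dot product plus an offset. The hyperplane parameters `w` and `b` below are hypothetical values chosen for illustration, not the result of training.

```python
import numpy as np

def decision(x, w, b):
    """Return +1 or -1 depending on the side of the hyperplane <w, x> + b = 0."""
    return int(np.sign(np.dot(w, x) + b))

w = np.array([1.0, -1.0])   # hypothetical normal vector of the hyperplane
b = 0.5                     # hypothetical offset

side_a = decision(np.array([2.0, 0.0]), w, b)
side_b = decision(np.array([0.0, 2.0]), w, b)
print(side_a, side_b)
```

Training an SVM amounts to choosing `w` and `b` so that the margin between the two classes is as wide as possible. The rule applied at prediction time stays this simple.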

         ,       .   (    )    .     w  b,   .

         .     Rn   H     : ? = Rn ? H.        , ..       : f(x)=sign(?w,?(x)?+b).

          :








        ,     :








 S


 S


  ,  log(h


)  log(1h


)      (f2) (  - ); f


  ,   ?         .     


,    x      x


      ,        ,   ?,     (C=1/?).

           .  ,     .

 .

      :



from sklearn.svm import SVC

clf = SVC(kernel = 'rbf', C=1)



     Fashion-MNIST.      SVC  ,  MLP, ,      KNeighborsClassifier,     : 10 000     2000  .     :



from sklearn.svm import SVC

clf = SVC(kernel = 'rbf',C=1).fit(X_train, y_train)



      accuracy:

Accuracy of SVC classifier on training set: 0.83

Accuracy of SVC classifier on test set: 0.82



, ,      (.      ),    accuracy,   0.87.



.  MLF_SVC_Fashion_MNIST_001.ipynb,   ,      https://www.dropbox.com/s/0p1i1dqk8wqwp5x/MLF_SVC_Fashion_MNIST_001.html?dl=0 (https://www.dropbox.com/s/0p1i1dqk8wqwp5x/MLF_SVC_Fashion_MNIST_001.html?dl=0)


  scikit-learn       GaussianProcessClassifier, DecisionTreeClassifier, GaussianNB  .    ,   ,     [[61 - Classifier comparison. https://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html (https://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html)]],            .




2.11.     .   





2.11.1.        


        .  ML   ,  .          .          .

,      ,         P(A),    B  P(B)      B    A  P(B|A),            B:








 .

,          ,    , ,   :








        (14)      ,    : sunny  , rainy  , overcast  .  ,    ,     (sunny).       ,    ('yes')   'Sunny',     :

P('yes'|'Sunny').



 ,      ,   = 'yes'    ,  B = 'Sunny'.

        ,     P('Sunny')      P('yes').  ,    ,        P('Sunny'|'yes').           :

P('yes'|'Sunny') = P('Sunny'|'yes') * P('yes') / P('Sunny')

 ,     . , :

A_value = 'yes'

B_hypothes = 'Sunny'

       :

P(A_value|B_hypothes) = P('yes'|'Sunny') = P('Sunny'|'yes') * P('yes') / P('Sunny')

  :

P('Sunny'|'yes') = 3 / 9 = 0.33

      ,   :

P('Sunny') = 5 / 14 = 0.36

P('yes') = 9 / 14 = 0.64

  , :

P('yes'|'Sunny') = 0.33 * 0.64 / 0.36 = 0.60.
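The same calculation can be checked with exact fractions, which avoids the rounding in the intermediate values 0.33, 0.64 and 0.36. The variable names are chosen for this sketch.

```python
from fractions import Fraction

p_sunny_given_yes = Fraction(3, 9)   # P('Sunny'|'yes')
p_yes = Fraction(9, 14)              # P('yes')
p_sunny = Fraction(5, 14)            # P('Sunny')

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_yes_given_sunny = p_sunny_given_yes * p_yes / p_sunny

print(p_yes_given_sunny, float(p_yes_given_sunny))
```

The exact result is 3/5, which agrees with the rounded value 0.60 obtained above.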




2.11.2. The Naïve Bayes algorithm


  ,       ,     , ,  ,    ..?             :








 NBI


      (Na?ve Bayes Inference); i  i-e     F (features),    . ,   P('yes')= P('no'),      1.  ,      ,     :








    :








 freq  ; N      .    P('Sunny'|'yes') = 3 / 9 = 0.33.

  Eq. 2 NBI    0  +∞.  NBI < 1,        ('no').  NBI > 1,    ,         ('yes'). ,      Eq. 2,        .

 ,         'yes'  'no',      ,          .     ,        ,    ,        .             .           , ,   :








   F     .    F = 2  :  (Weather)    (Field).   , N       ,        ,   , 9.

            ,   log(a*b) = log(a) + log(b).       :








        .     ,            -∞  +∞.  NBI


   0,     ,   0,      ( 2.14).






 2.14.    Naïve Bayes    Eq. 2.21 ()  Eq. 2.24 ()



  Naïve Bayes       (Eq. 2.23, 2.24).       Eq. 2.24.

 .

   ,    .         Field.       (Weather, Field)    Play:








-    ,        ,        (bad, good):

P('yes'|'Sunny' & 'good').



 ,   :

P('Sunny'|'yes') = 3 / 9 = 0.33



  :

P('Sunny'|'no') = 2 / 5 = 0.4

P('good'|'yes') = 5 / 9 = 0.5556

P('good'|'no') = 2 / 5 = 0.4



    Eq. 2.1:

NBI('Sunny' & 'good') = [P('Sunny'|'yes') / P('Sunny'|'no')] * [P('good'|'yes') / P('good'|'no')] = 1.157,



   ,    ,     P('yes'),    ,      P('no'),    1, , ,  .
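The ratio computed in this example, and its logarithmic variant based on log(a*b) = log(a) + log(b), can be sketched as follows. The sketch assumes equal priors P('yes') = P('no'), as in the text, and uses only the four conditional probabilities listed above; the function and dictionary names are illustrative and not taken from the lab notebook.

```python
import math

# Conditional probabilities P(feature value | class) from the example above.
likelihoods = {
    'Sunny': {'yes': 3 / 9, 'no': 2 / 5},
    'good':  {'yes': 5 / 9, 'no': 2 / 5},
}

def nbi(features):
    """Product of per-feature likelihood ratios (equal priors assumed)."""
    ratio = 1.0
    for f in features:
        ratio *= likelihoods[f]['yes'] / likelihoods[f]['no']
    return ratio

def log_nbi(features):
    """Same quantity on a log scale: a sum instead of a product."""
    return sum(math.log(likelihoods[f]['yes'] / likelihoods[f]['no'])
               for f in features)

score = nbi(['Sunny', 'good'])
log_score = log_nbi(['Sunny', 'good'])
print(round(score, 3), 'yes' if score > 1 else 'no')
print('yes' if log_score > 0 else 'no')
```

The product form is compared with 1 and the logarithmic form with 0, so both give the same decision. The sum of logarithms is less prone to numerical underflow when many features are multiplied.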



.   NBA     ML_Lab01.2_NaiveBayesSimpleExampleByPython  https://www.dropbox.com/sh/oto9jus54r4qv7x/AAAcOtl9SE-i6b1zViwMP6Wga?dl=0





2.11.3.     Naïve Bayes



 

,    ,    .     , Naïve Bayes Algorithm (NBA)   ,     (logistic regression),        .

NBA     ,   .      ,     .


 

         ,       ,            .        (zero frequency).       .          (Laplace smoothing).
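The zero-frequency problem and Laplace smoothing mentioned above can be illustrated with a short sketch. It assumes additive smoothing with parameter alpha = 1 and a feature with three possible values (sunny, rainy, overcast); the count of zero is a hypothetical case, not a value from the example table.

```python
def conditional_prob(count, class_total, n_values, alpha=1.0):
    """Estimate P(value | class) with additive (Laplace) smoothing."""
    return (count + alpha) / (class_total + alpha * n_values)

# Hypothetical case: a weather value never observed together with 'yes' (9 'yes' rows).
unsmoothed = 0 / 9
smoothed = conditional_prob(count=0, class_total=9, n_values=3)
print(unsmoothed, smoothed)
```

Without smoothing, a single zero factor drives the whole product of likelihoods to zero regardless of the other features. With smoothing, the estimate becomes small but non-zero.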

 NBA   ,        .       ,   predict_proba.

   NBA     .         .

     - ,     . -,       . -,  ,    ,            .    ,      :      ,       .       , :  ,     .         (),   ,    ,   , , , .                    .              .          ,   .    ,      , , .

     ,        ,      ,               .    ,     ,   .




2.11.4.    


     . NBA   ,           . NB    .

 ,  ,   ,  ,  ,   .            , NBA   .            ,    (sentiment analysis),  ,   (information retrieval),    (author identification),    (word disambiguation).

 . NBA    ,         (collaborative filtering) [[62 - _. ru.wikipedia.org/wiki/_; https://en.wikipedia.org/wiki/Collaborative_search_engine]].       .                 .   ,        ,         .




2.12.    . 


 ,       ,       . , ,      ,    ,    ,    ,   ,   .

  ,   [[63 - Friedman, Jerome H. Greedy function approximation: a gradient boosting machine // Annals of Statistics. 2001. P. 1189–1232.]],   ,            h


(x)     (a)    , ,     (b)   h


(x) ,     :








 ,       :








 L   ,     a  b.     J


(?)      .

     :








,    


      ,  (b)  ,      


,   


,      (b)   (x


, y


)   (x


,L'(y


, h


(x


).  J


(?)   ,    ()  ..

 ,    [[64 - Boosting. http://www.machinelearning.ru/wiki/index.php?title=%D0%91%D1%83%D1%81%D1%82%D0%B8%D0%BD%D0%B3]],                 .  ,               .        ,         .     ,         ,    .
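The central step of gradient boosting, fitting each new weak learner to the negative gradient of the loss at the current predictions, can be sketched in plain NumPy. This is an illustration under stated assumptions and not the XGBoost algorithm: it uses a squared-error loss (whose negative gradient is simply the residual), one-split regression stumps as weak learners, a fixed learning rate, and synthetic sine data. All names are hypothetical.

```python
import numpy as np

def fit_stump(x, r):
    """Best one-threshold regression stump for targets r on a 1-D feature x."""
    best = None
    for t in np.unique(x)[:-1]:
        left, right = r[x <= t], r[x > t]
        err = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or err < best[0]:
            best = (err, t, left.mean(), right.mean())
    _, t, left_value, right_value = best
    return lambda q: np.where(q <= t, left_value, right_value)

def gradient_boost(x, y, n_rounds=50, lr=0.1):
    """Additive model: start from the mean, then add lr * stump at every round."""
    pred = np.full_like(y, y.mean(), dtype=float)
    for _ in range(n_rounds):
        residual = y - pred              # negative gradient of 0.5 * (y - pred)**2
        stump = fit_stump(x, residual)   # weak learner trained on (x_i, residual_i)
        pred = pred + lr * stump(x)
    return pred

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 6, 200))
y = np.sin(x) + rng.normal(scale=0.1, size=200)

pred = gradient_boost(x, y)
mse_start = np.mean((y - y.mean()) ** 2)
mse_end = np.mean((y - pred) ** 2)
print(mse_start, mse_end)
```

Each round reduces the training error a little. The learning rate trades the number of rounds against the risk of overfitting, which is also the role of the corresponding parameter in boosting libraries.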

          .     ,     ,  XGBoost (Extreme Gradient Boosting).       :



import xgboost

clf = xgboost.XGBClassifier(nthread=1)



 XGBClassifier    Fashion-MNIST:



clf = xgboost.XGBClassifier(nthread=4,scale_pos_weight=1)

clf.fit(X_train, y_train)



nthread   ,          .



,     :

Accuracy of XGBClassifier on training set: 0.88

Accuracy of XGBClassifier on test set: 0.86



      .             ,  :

##X_train1=X_train1/255.0

##X_test1=X_test1/255.0



      ,     .



.         ,     .  ,    Fashion-MNIST    10 . ,   Fashion-MNIST   XGBoost (MLF_XGBoost_Fashion_MNIST_001),     https://www.dropbox.com/s/frb01qt3slqkl6q/MLF_XGBoost_Fashion_MNIST_001.html?dl=0





2.13.   .   


   (Principal Component Analysis  PCA)        ,   ,    .     ,          .           .                         ,   ,         ,   ,        .        [[65 - Pearson K. On lines and planes of closest fit to systems of points in space // Philosophical Magazine. 1901. Vol. 2. P. 559–572.], [66 - Sylvester J. J. On the reduction of a bilinear quantic of the nth order to the form of a sum of n products by a double orthogonal substitution // Messenger of Mathematics. 1889. Vol. 19. P. 42–46.]].

    ,         (),     .  ,          .        .

 1.              .  ,   ,   :








, ,  X     m x n (m    , n     ,  ),   :








 2.  ,               ,    .  ,     S,    v    ,    w  , :








 w        S.       :

   S,        n x n,  n   .

    V  n x n,   n   ,      n .

   n  ,     x


.

  n     k,     , ,       (variation).  ,  ,        ,   .

 ,    V,        x.    Vreduced.         X:

Z= Vreduced*X.T.

     Z,   X    .      ,    Z   X,       ,    x


 .

        .   2.15a     ,    200       .   :

X = np.dot(np.random.random(size=(2, 2)), np.random.normal(size=(2, 200))).T



  ,       :

S=(1/X.shape[0])*np.dot(X.T,X) # covariance matrix; divide by the number of examples m = X.shape[0] (X is assumed mean-centered)

w, v = np.linalg.eigh(S)



      v,          z  zz:

vreduced=v[:,1]

vreduced1=v[:,0]

z=np.dot(vreduced,X.T)

zz=np.dot(vreduced1,X.T)



,         ,    ( 2.15b).  ,      ,      . , ,    ( 2.15b)  (  ),      .

,          ,   :

Xa= Vreduced*Z.



 ,     , , ,  ( 2.15).
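The steps described above (centering, covariance matrix, eigendecomposition, projection, and approximate reconstruction) can be collected into one self-contained sketch. It follows the same data-generation recipe but uses a seeded generator for reproducibility; the names `v_reduced`, `X_approx` and `explained` are illustrative, and writing the reconstruction as an outer product is one possible reading of Xa = Vreduced*Z for a single component.

```python
import numpy as np

rng = np.random.default_rng(0)
# 200 correlated two-dimensional points, shape (200, 2).
X = np.dot(rng.random(size=(2, 2)), rng.normal(size=(2, 200))).T

X = X - X.mean(axis=0)          # step 1: mean-center every feature
S = X.T @ X / X.shape[0]        # step 2: covariance matrix, shape (2, 2)
w, v = np.linalg.eigh(S)        # eigenvalues in ascending order, eigenvectors in columns

v_reduced = v[:, -1]            # eigenvector with the largest eigenvalue
z = X @ v_reduced               # projection onto the first principal component, shape (200,)
X_approx = np.outer(z, v_reduced)  # approximate reconstruction, shape (200, 2)

explained = w[-1] / w.sum()     # share of the total variance kept by one component
print(explained)
```

The variance of `z` equals the largest eigenvalue, and the mean squared reconstruction error equals the discarded one. This is why the components are ranked by eigenvalue.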






a)   ,      






b)       (    )






)      .      

 2.15.     PCA



   ( 2.15)  ,   PCA    ,     .   ,           y ( ),   PCA     ( 2.16).






 2.16.     ()  PCA ()





.         MLF_PCA_numpy_001.ipynb  https://www.dropbox.com/s/65y1z7svf7epx1q/MLF_PCA_numpy_001.html?dl=0


 scikit-learn      PCA,            ,        z.
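With scikit-learn the same reduction takes a few calls. This is a hedged sketch on synthetic data generated as in the NumPy example, not an excerpt from the lab notebook; `n_components=1` is an illustrative choice.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = np.dot(rng.random(size=(2, 2)), rng.normal(size=(2, 200))).T  # shape (200, 2)

pca = PCA(n_components=1)
z = pca.fit_transform(X)             # reduced representation, shape (200, 1)
X_approx = pca.inverse_transform(z)  # back-projection into the original space, shape (200, 2)

print(pca.components_)                # principal axis (unit vector)
print(pca.explained_variance_ratio_)  # share of variance kept by the component
```

`PCA` centers the data internally, so no manual mean subtraction is needed before `fit_transform`.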



.     PCA    scikit-learn ,     ML_lab08_Principal Component Analysis  https://www.dropbox.com/sh/xnjiztxoxpqwos3/AADoUPfNeMnEXapbqb3JHHvla?dl=0





2.14.  


     k-NN     ?

      ?

      .

   Naïve Bayes?

   Naïve Bayes.

   Naïve Bayes.

       Naïve Bayes?

      Naïve Bayes?

  ?

       ?

  PCA?

    ,    PCA?




3.   ML



      ML    ,    .



.               ,    ,   .


      ,    ML     .



. ,       ,      https://www.dropbox.com/s/nc1qx6tjw11t5gs/MLF_Evaluation001.ipynb?dl=0


 ,  ,     ,     ,    ,        . ,   ,        ,  ,             .   ,      ,       (SVM),       .  ,            (,  ,      ..),     ,     .



.   , ,     .


       ,   ,         .

           ,    ML,    . ,   ,     ,      (user satisfaction)   ,       (amount of revenue),       (patient survival rates)  ..      ,         ,   .

     , ,    ,     ML,   ,     .         ()   .             :  (precision),  (recall),      F1  F (F1 score  F-score).



. ,      ,     ,      ,     .


         ,           .      , :



Accuracy

Precision

Recall

F1 score

F-score

Area Under the Curve (AUC)



 ,      :

1. Precision-Recall curve

2. ROC curve



        ML    ,          .  ,  ,         ,      ,       .             ,      ,          ,      ,              .

 ,          ,    .     ML                .




3.1.    


                 (accuracy)  Correct Classification Rate (R)       (     ):








 N


    ; N    .

    ,         (  ,  ,   skewed classes),    ,         A. ,   1-  90%    ,   2-   10%,     ,     1- ,      90%.  ,         2- ,        A.  ,    2-   ,  A     .       ,     :  (precision),  (recall),     F1 score (    F1),      :








  .

     (    1 (positive)    ,    0 (negative)).      :








 True positive (TP)  True negative (TN)     , ..     . C, False negative (FN)  False positive (FP)    . FN      ,        ,     .         () , .. ML-   ,       . FP    , ,   ,  , ,   ML-   ,       .

Precision (P)              ,      .   , Recall (R)            .

    P,  R    .  R ,            (  R)     (negative)  ,     P ,     ,     (  P)   (positive) . , ,       1, ,  ,          P  R,     P     R,  .   3.1a  3.1b         precision  recall,      ,   ,       .
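The effect of skewed classes on accuracy, and the way precision, recall and the F1 score expose it, can be shown with the standard definitions P = TP / (TP + FP), R = TP / (TP + FN), F1 = 2PR / (P + R). The counts below are hypothetical and only mirror the 90% / 10% split discussed earlier, with the rare class treated as positive.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, precision, recall, f1

# 90 negatives, 10 positives; this classifier predicts "negative" for every object.
always_negative = classification_metrics(tp=0, fp=0, fn=10, tn=90)

# A classifier that finds 8 of the 10 positives at the cost of 5 false alarms.
useful = classification_metrics(tp=8, fp=5, fn=2, tn=85)

print(always_negative)
print(useful)
```

The first classifier reaches 90% accuracy while its recall and F1 are zero. The second is only slightly more accurate, yet its recall of 0.8 shows that it actually detects the rare class.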







notes








1


http://www.gartner.com/newsroom/id/3412017




2


https://rapidminer.com/




3


Octave online. https://octave-online.net/ (2017-04-01).




4


Octave download. https://www.gnu.org/software/octave/download.html (2017-04-01).




5


The Artificial Intelligence (AI) White Paper. https://www.iata.org/contentassets/b90753e0f52e48a58b28c51df023c6fb/ai-white-paper.pdf (2021-02-23).




6


Nguyen G. et al. Machine Learning and Deep Learning frameworks and libraries for large-scale data mining: A survey // Artificial Intelligence Review. 2019. Vol. 52. No. 1. P. 77–124.




7


Joseph A. Cruz and David S. Wishart. Applications of Machine Learning in Cancer Prediction and Prognosis // Cancer Informatics. 2006. Vol. 2. P. 59–77.




8


Miotto R. et al. Deep learning for healthcare: Review, opportunities and challenges // Briefings in Bioinformatics. 2017. Vol. 19. No. 6. P. 1236–1246.




9


Ballester, Pedro J. and John BO Mitchell. A machine learning approach to predicting protein–ligand binding affinity with applications to molecular docking // Bioinformatics. 2010. Vol. 26. No. 9. P. 1169–1175.




10


Mahdavinejad, Mohammad Saeid, Mohammadreza Rezvan, Mohammadamin Barekatain, Peyman Adibi, Payam Barnaghi, and Amit P. Sheth. Machine learning for Internet of Things data analysis: A survey // Digital Communications and Networks. 2018. Vol. 4. Issue 3. P. 161–175.




11


Farrar, Charles R. and Keith Worden. Structural health monitoring: A machine learning perspective. John Wiley & Sons, 2012. 66 p.




12


Lai J. et al. Prediction of soil deformation in tunnelling using artificial neural networks // Computational Intelligence and Neuroscience. 2016. Vol. 2016. P. 33.




13


Liakos, Konstantinos et al. Machine learning in agriculture: A review // Sensors. 2018. 18(8). P. 2674.




14


Friedrich Recknagel. Application of Machine Learning to Ecological Modelling // Ecological Modelling. 2001. Vol. 146. P. 303–310.




15


 . .,  . .,  . .           //    . 2018.  3. . 1425.




16


Clancy, Charles, Joe Hecker, Erich Stuntebeck, and Tim O'Shea. Applications of machine learning to cognitive radio networks // Wireless Communications, IEEE. 2007. Vol. 14. Issue 4. P. 47–52.




17


Ball, Nicholas M. and Robert J. Brunner. Data mining and machine learning in astronomy // International Journal of Modern Physics D. 2010. Vol. 19. No. 7. P. 1049–1106.




18


R. Muhamediyev, E. Amirgaliev, S. Iskakov, Y. Kuchin, E. Muhamedyeva. Integration of Results of Recognition Algorithms at the Uranium Deposits // Journal of ACIII. 2014. Vol. 18. No. 3. P. 347–352.




19


 . .,  . .,  . .,  . .           //   . 2013.  3. . 8288.




20


Chen Y., Wu W. Application of one-class support vector machine to quickly identify multivariate anomalies from geochemical exploration data // Geochemistry: Exploration, Environment, Analysis. 2017. Vol. 17. No. 3. P. 231–238.




21


Hirschberg J., Manning C. D. Advances in natural language processing // Science. 2015. Vol. 349. No. 6245. P. 261–266.




22


Goldberg Y. A primer on neural network models for natural language processing // Journal of Artificial Intelligence Research. 2016. Vol. 57. P. 345–420.




23


            ,    ,       .




24


Taiwo Oladipupo Ayodele. Types of Machine Learning Algorithms // New Advances in Machine Learning. 2010. P. 19–48.




25


Hamza Awad Hamza Ibrahim et al. Taxonomy of Machine Learning Algorithms to classify realtime Interactive applications // International Journal of Computer Networks and Wireless Communications. 2012. Vol. 2. No. 1. P. 69–73.




26


Muhamedyev R. Machine learning methods: An overview // CMNT. 2015. 19(6). P. 14–29.




27


Goodfellow I. et al. Deep learning. Cambridge: MIT press, 2016. Vol. 1. No. 2.




28


Nassif A. B. et al. Speech recognition using deep neural networks: A systematic review // IEEE Access. 2019. Vol. 7. P. 19143–19165.




29


Hastie T., Tibshirani R., Friedman J. Unsupervised learning. New York: Springer, 2009. P. 485–585.




30


Kotsiantis, Sotiris B., I. Zaharakis, and P. Pintelas. Supervised machine learning: A review of classification techniques // Emerging Artificial Intelligence Applications in Computer Engineering. IOS Press, 2007. P. 3–24.




31


Jain A. K., Murty M. N., Flynn P. J. Data clustering: A review // ACM computing surveys (CSUR). 1999. Vol. 31. No. 3. P. 264–323.




32


Wesam Ashour Barbakh, Ying Wu, Colin Fyfe. Review of Clustering Algorithms. Non-Standard Parameter Adaptation for Exploratory Data Analysis // Studies in Computational Intelligence. 2009. Vol. 249. P. 7–28.




33


Mukhamediev R. I. et al. From Classical Machine Learning to Deep Neural Networks: A Simplified Scientometric Review // Applied Sciences. 2021. Vol. 11. No. 12. P. 5541.




34


 . .       . , 2016. 200 . ISBN 978-9934-14-876-7.




35


 . .  ,   ,  ,  WEKA, RapidMiner  MatLab (      ):  . .: .     . . . , 2010.




36


Martin Fodslette Møller. A scaled conjugate gradient algorithm for fast supervised learning // Neural Networks. 1993. Vol. 6. Issue 4. P. 525–533.




37


Dong C. Liu, Jorge Nocedal. On the limited memory BFGS method for large scale optimization // Mathematical Programming. 1989. Vol. 45. Issue 1–3. P. 503–528.




38


Derivative of Cost Function for Logistic Regression. https://medium.com/mathematics-behind-optimization-of-cost-function/derivative-of-log-loss-function-for-logistic-regression-9b832f025c2d




39


Warren S. McCulloch, Walter Pitts. A logical calculus of the ideas immanent in nervous activity // The bulletin of mathematical biophysics. 1943. Vol. 5. Issue 4. P. 115–133.




40


Rosenblatt, F. The perceptron: A probabilistic model for information storage and organization in the brain // Psychological Review. 1958. Vol. 65 (6). P. 386–408.




41


Minsky M. L., Papert S. A. Perceptrons: An Introduction to Computational Geometry. MIT, 1969. 252 p.




42


Marvin Minsky, Seymour Papert. Perceptrons, expanded edition. The MIT Press, 1987. 308 p.




43


Werbos P. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. Harvard University, 1974. 38 p.




44


Werbos P. J. Backpropagation: past and future // IEEE International Conference on Neural Networks. San Diego, 1988. Vol. 1. P. 343–353.




45


: .   . .: -  . . . , 2004. 320 .




46


 . .       // : , . .: , 2006.  2. . 4971.




47


 . .    :    . .: , 2008. 176 .




48


 . .  :  .    , 2010. 496 .




49


Connectionism. Internet Encyclopedia of Philosophy. https://iep.utm.edu/connect/#:~:text=Connectionism%20is%20an%20approach%20to,%2C%20neuron%2Dlike%20processing%20units




50


David Saad. Introduction. On-Line Learning in Neural Networks. Cambridge University Press, 1998. P. 38.




51


Cybenko G. Approximation by superpositions of a sigmoidal function // Mathematics of Control, Signals, and Systems. 1989. Vol. 2. No. 4. P. 303–314.




52


Hornik K. et al. Multilayer feedforward networks are universal approximators // Neural Networks. 1989. Vol. 2. P. 359–366.




53


Schmidhuber, Jürgen. Deep learning in neural networks: An overview // Neural Networks. 2015. Vol. 61. P. 85–117.




54


Van Veen F. The Neural Network Zoo. http://www.asimovinstitute.org/neural-network-zoo/ (posted 2016-09-14).




55


Werbos P. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. Harvard University, 1974. 38 p.




56


Batch, Mini-Batch & Stochastic Gradient Descent. https://towardsdatascience.com/batch-mini-batch-stochastic-gradient-descent-7a62ecba642a




57


   :  . https://www.tensorflow.org/tutorials/keras/classification (https://www.tensorflow.org/tutorials/keras/classification)




58


Dudani, Sahibsingh A. The Distance-Weighted k-Nearest-Neighbor Rule // Systems, Man, and Cybernetics. 1976. Vol. SMC-6. Issue 4. P. 325–327.




59


K-Nearest Neighbors algorithm. http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm (2012-07-05).




60


Support vector machine. http://en.wikipedia.org/wiki/Support_vector_machine (2012-02-22).




61


Classifier comparison. https://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html




62


_. ru.wikipedia.org/wiki/_; https://en.wikipedia.org/wiki/Collaborative_search_engine




63


Friedman, Jerome H. Greedy function approximation: a gradient boosting machine // Annals of Statistics. 2001. P. 1189–1232.




64


Boosting. http://www.machinelearning.ru/wiki/index.php?title=%D0%91%D1%83%D1%81%D1%82%D0%B8%D0%BD%D0%B3




65


Pearson K. On lines and planes of closest fit to systems of points in space // Philosophical Magazine. 1901. Vol. 2. P. 559–572.




66


Sylvester J. J. On the reduction of a bilinear quantic of the nth order to the form of a sum of n products by a double orthogonal substitution // Messenger of Mathematics. 1889. Vol. 19. P. 42–46.


