PyCon 2014

The KNN (K-Nearest Neighbor) Algorithm with Python

Portia Burton




Excerpt from the automatic transcript of the video generated by YouTube.

Alright, good morning everyone. For our next talk we have Portia Burton, who will be giving a talk titled "Know Thy Neighbor: Scikit and the K-Nearest Neighbor Algorithm". Please give a hand for Portia.

Hello, thank you for coming. This presentation is going to be on k-nearest neighbor and scikit-learn. We're going to keep this presentation high level, so we're not going to get too deep, and most of it will be based on the scikit-learn documentation. So, for those of you who are part of keeping up the documentation, thank you.

A little bit about me: I'm from Portland, and I'm the organizer of the Portland Data Science Group. We're basically a gang of people who like visualizing data and analyzing data. Once a month we have talks and once a month we have a half day. I really love this group. I'm also a volunteer with Hack Oregon, a civic organization that takes campaign data and makes it more accessible to the people of the state of Oregon, and I am also the founder of the company PLB Analytics.

What will we cover today? A brief intro to machine learning; we'll go over scikit-learn; we'll explain the k-nearest neighbor algorithm; and finally, our demo will be based on the scikit-learn package and KNN.

Machine learning: what is machine learning? We've had many presentations during this PyCon that explain machine learning, but I'm just going to get to the core of it. Machine learning is basically when an algorithm learns from the data: it takes the data and it makes assumptions.
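As a rough illustration of what "an algorithm learns from the data" can mean, here is a minimal sketch of a k-nearest neighbor classifier in plain Python. It is not taken from the talk: the data points, the labels, and the function name `knn_predict` are all made up for illustration, and it assumes Python 3.8 or later for `math.dist`.

```python
from collections import Counter
import math

# Made-up training data for illustration: (height_cm, weight_kg) pairs with labels.
training_points = [(150, 50), (155, 55), (160, 58), (180, 85), (185, 90), (190, 95)]
training_labels = ["small", "small", "small", "large", "large", "large"]


def knn_predict(point, points, labels, k=3):
    """Label `point` by majority vote among its k closest training points."""
    # Pair each training point's Euclidean distance to `point` with its label.
    distances = [(math.dist(point, p), label) for p, label in zip(points, labels)]
    # Keep the k nearest neighbors and count how often each label appears.
    nearest = sorted(distances)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Classify two points the "model" has never seen.
prediction_small = knn_predict((158, 54), training_points, training_labels)
prediction_large = knn_predict((182, 88), training_points, training_labels)
print(prediction_small, prediction_large)
```

The "assumption" the algorithm makes here is simply that a new point probably shares the label of the labeled points closest to it.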

What do machine learning algorithms use this data for? To give you a sampling: they create predictive models, classify unknown entities, and discover patterns. Once again, this is only a sampling of what machine learning algorithms can do. Today we are mostly going to concentrate on creating predictive models and classifying unknown entities.

This is my basic workflow in machine learning. When I'm dealing with data, I spend about seventy percent of my time cleaning and standardizing data, putting it into Postgres, and chasing stray commas away. Twenty percent of my time is used to pre-process, train, and validate the data, and ten percent is used to analyze and visualize my data. When I visualize my data I try to use D3.js; it's a great package for interactive visualizations and it's a lot of fun to use.

Scikit-learn: scikit-learn is basically Python's answer to machine learning. It's a Python machine learning package. Scikit-learn is great because it has very good documentation, it is well written, and finally it has great data sets, for example the Boston housing market. Today we will be making use of a scikit-learn data set that's already part of the package. If you're new to machine learning and you're using Python, this cheat sheet is a great way to figure

[ ... ]
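The pre-process, train, and validate steps mentioned in the excerpt, applied to a data set bundled with scikit-learn, might look roughly like the sketch below. This is not the speaker's demo: the excerpt does not say which bundled data set the talk uses, so the iris data set is an assumption, as are the choice of `StandardScaler`, the 25% hold-out, and `n_neighbors=5`. It assumes a recent scikit-learn release.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Assumption: iris stands in for whichever bundled data set the talk uses.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the rows so the model is validated on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Pre-process: learn the scaling from the training rows only, then apply it to both sets.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train: k-nearest neighbors stores the training rows and votes among the 5 closest.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train_scaled, y_train)

# Validate: fraction of held-out rows whose predicted class matches the true class.
accuracy = model.score(X_test_scaled, y_test)
print(f"Validation accuracy: {accuracy:.2f}")
```

Scaling matters for k-nearest neighbor because the algorithm compares distances, so a feature measured in large units would otherwise dominate the vote.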

Note: the remaining 1,347 words of the full transcript have been omitted to comply with YouTube's "fair use" rules.