Excerpt from YouTube's automatic transcription of the video.
- My name is Hsing-Hui Hsu, that actually is how it's spelled. All the H's are silent, so on Twitter you can find me by SoManyHs. Today I'm gonna talk to you about parsing. For those of you who saw Aaron Patterson's talk this morning on the Ruby virtual machine, my talk covers all the boring steps that his talk skipped over.
So if you still feel like you want to skip over those steps, now is your chance. A little explanation about the title of this talk. I probably first started thinking about parsing when I was teaching English as a foreign language, and I had to explain why so many words in English, like "flies," had different meanings, and depending on which meaning the word had, the sentence could go in really different directions.
When I started looking into how computer languages are parsed, I noticed that there are actually a lot of similarities to the way humans parse sentences in languages that they hear and read every day. I'm a relative newcomer to the programming world, so I never learned about parsers or compilers in school, or anything.
So I have an alt title for this talk which is, How I accidentally a Computer Science, and So Can You! Of course, this isn't going to be a comprehensive survey of parsers, but I just wanted to share how I came about learning about them as someone who doesn't have a CS degree, because I had no idea what they were and I sort of ran into them by accident.
I had been building Rails applications for a while, and one day I got tired of just blindly trusting all that Rails magic, and I wanted to see how things really worked underneath the hood. Specifically, I wanted to figure out how routing worked, so I went to GitHub and started poking around the Rails source code, and I came across a file called parser.rb and I was like, "cool."
I'd done an exercise that was sort of like this, so I thought I'd probably know what the code would look like, but then I looked at the file, and it was nothing like what I expected. It was just a bunch of arrays with numbers. Where was the logic? Who even wrote this stuff? So I was like, "Okay, that's not enough."
Looking closer, I saw that the previous file was actually generated using a different file, called parser.y. So I went and looked at that. It still didn't make any sense, and it barely looked like any Ruby code that I recognized. There's semicolons, what? So that's when I decided to figure out what all this meant.
So to get warmed up to the idea of parsing, let's play a game. I hope that most of you have played it before, but for those of you who may not be familiar, this is a word game where one person asks for certain kinds of words, such as a noun or an adjective to make sentences.
So, for example, I need a verb, anyone? Did I hear "drank"? Sure, "drank." Great, awesome. "The young man drank." Awesome. That's a valid sentence, right? If you can remember your middle school grammar lessons, we can diagram this sentence into a subject and a verb, where the verb is "drank," and since all sentences need a subject to actually do the verb, we know that the subject is "the young man."
Makes sense so far, right? But let's see if there are other possible sentences that could have started with "the young man." So for example, the young man did something, so here I need a verb and a noun. So we can stick with our original verb, "drank." So any suggestions for a noun? Beer, now my crowd.
So, "the young man drank beer." Also grammatical. This time, instead of just subject + verb, we have subject + verb + object, which again abides by the rules of English grammar, if we interpret "the young man" as the subject, "drank" as the verb, and "beer" as the direct object.
But still, can we come up with even more possibilities for valid sentences that start with "the young man," either using subject + verb, or subject + verb + object, or even something else? So for example, I need a noun. Anyone? So for example, is this a grammatical sentence: "the young man the boat"?
Yes, no, do we know? Based on the previous examples, at first it's easy to think that it's not grammatical, 'cos we assume that the subject for this sentence is still "the young man." So then when we see "the boat," our brains are like, "nope, not a sentence."
And this is because we know that a subject has to be followed by a verb, but it turns out you can parse this sentence in a different way. So here, if we interpret the subject as "the young," as in young people, and "man" as the verb, we see that it still follows the same rules as the sentence "the young man drank beer," which was subject + verb + object.
So we were initially led astray because we tend to think of "man" as a noun, and therefore, as in the previous sentences, as part of the subject and not a verb. This kind of sentence is called a "garden path sentence." Garden path sentences are sentences that contain a word that could be interpreted in more than one way, so that you think the sentence is going to follow one structure before it pivots on that ambiguous word and goes in a different direction.
"Time flies like an arrow; fruit flies like a banana" follows that pattern. As well as, "The man who hunts ducks out on weekends." "The prime number few." "When Fred eats food gets thrown." "Mary gave the child the dog bit a Band-Aid." And "The woman who whistles tunes pianos."
So ambiguity in words or sentence structure can also produce hilarious unintentional second meanings for newspaper headlines, such as "Grandmother of eight makes a hole in one." "Man eating piranha mistakenly sold as pet fish." "Eye drops off shelf." "Complaints about NBA referees growing ugly."
And my personal favorite, "Milk drinkers are turning to powder." Regardless of the ambiguity in natural languages like English, we can all agree that as speakers of a language, there are certain grammar rules that we are all aware of, that allow us to make decisions as to whether or not a sentence is a valid expression in that language.
As we've seen from the last few examples, we know that one possible kind of valid sentence in English consists of subject + verb + stuff. And again, most of us probably did some kind of sentence diagramming in middle school that looked a little bit like this.
[ ... ]
Note: the remaining 3,035 words of the full transcript have been omitted to comply with YouTube's "fair use" rules.