Sunday, July 18, 2010

The BIG LIE of Japanese language learning.

Kanji don't have readings.

"Of course they do. My kanji flash cards have lists of readings for each kanji."

But that misrepresents the history and structure of Japanese.

Japanese began as a spoken language. For most of its history, there was no way to write down Japanese. Words were combinations of sounds, and certain sounds were used over and over to express the same concept in different words.

Then someone was hanging out in China and noticed they had a way of scribbling on bark to represent words. Certain characters related to concepts, and groups of characters represented words of the Chinese language. "Holy crap!" he thought. "That's the most useful thing I've seen in my life." So he spent a while collecting a huge list of words - meanings and how they were written in Chinese characters.

He brought this big list of words back to Japan, and started assigning spoken Japanese words to written Chinese ones. For each Japanese word he'd try to find the best matching written Chinese word, and for each written Chinese word he'd try to find the best matching spoken Japanese word. Sometimes there would be several written Chinese words that mapped onto the same spoken Japanese word, sometimes several spoken Japanese words mapped onto a single written Chinese word.

So, back before written language, there was one Japanese word for "hot". Well, not really, there were all sorts of Japanese words for different levels of temperature in different contexts. But the sound "ATSUI" was widely used to mean "high temperature". Along came the big list of written Chinese words, which also had a variety of expressions for "hot". And the written Chinese word for "hot" as in "hot to the touch" and the written Chinese word for "hot" as in "it is hot outside" were both mapped onto the spoken Japanese word "ATSUI".

I've heard it explained that "Japanese has two different words for 'hot', they are written differently but pronounced the same" as if it were coincidence that two closely related words happened to be homophones. Bull. The correct way to express this is "The Japanese word 'ATSUI' means 'hot'. When you're speaking or writing a children's book in kana, that's all you need to know. But when you start reading and writing kanji, it gets more complicated, because there are several ways to write 'ATSUI'. If you mean 'hot to the touch' you write it one way, if you mean 'it is hot outside' you write it another. You'll have to remember which is which."

In languages with phonetic alphabets, words are related by CONCEPT <-> PRONUNCIATION <-> WRITING. This holds pretty well even for Chinese, which uses an ideographic writing system - because the written and spoken language came from the same culture, the sounds that were reused to represent the same concept in various words simply became written ideograms reused to represent the same concept in various words. Thus, when you see a written word, you have a pretty good idea how it will be pronounced.

But because Japanese imported someone else's written language wholesale and grafted it onto their spoken language, it is much messier. For Japanese words, CONCEPT -> PRONUNCIATION and CONCEPT -> WRITING. PRONUNCIATION and WRITING are only consistent to the degree that (spoken) Japanese and Chinese happened to have similar degrees of meaning, and reuse the same word-units in different words. For simple concepts and compounds, this happens a lot... in fact, it is surprising how well so much of the language lines up, considering that the alignment is basically coincidental. But the most basic kanji are also the ones used in the most different words, which is why 'fire' and 'water' have about a thousand readings each.

Bulk learning the readings of each kanji is a big waste of time - kanji don't have readings. Kanji represent words, and words have pronunciations.

Japanese starts to make a lot more sense when you realize this.

First off, the craziness of "kanji readings" makes sense when you realize there is really no such thing. Imagine there is a certain species of tree. In Chinese, this type of tree is called "Mountain Tree" for the location where it is found. However the Japanese gave it the spoken name "Green Tree" for its verdant color. When the kanji were imported to Japan, this species was identified, and so although it is spoken with sounds that mean "Green Tree", it is written with kanji that mean "Mountain Tree." So the kanji for "mountain" is pronounced "green" in the case of this species of tree - not because "it has that reading" but simply because the two cultures found different ways to describe that particular tree, and the written name was imported long after the spoken name was established. TODO: actual examples

You may even encounter AB vs BA - e.g. Chinese use the term "fireball" but Japanese use the term "ballfire" - so for that particular word the kanji "fire" is pronounced "ball", and vice-versa. TODO: actual examples

On the flipside, many words that seem unrelated when you look at the kanji divulge their relationships if you use your ears. If you encounter two Japanese words that have a syllable in common, and there is some overlap in the meaning of the words, those words are probably etymologically related, even if the kanji used to write them are completely different. TODO: actual examples

For instance, there are a bunch of spoken sounds and written characters that mean "large" "big" "high" "great". Japanese might use the same sound for "tall" and "high priced", but "high speed" uses a different sound. Chinese uses the same character for "tall" as for "high speed". So Japanese has a handful of sounds for big/tall/great/high and Chinese has a handful of characters for the same... each has their own connotations, but there are so many words and compounds which make use of the general concept that it would be ridiculous to expect the Japanese and Chinese to have happened to pick the exact same variant for every single word. Instead, pretty much every combination of sound and writing occurs.

Specifically:

Chinese character 高 means high/tall. Used in Japanese word 高い たかい TAKAI meaning tall/high. This character is also used in a metric ton of Chinese words: 高原 highland, 高速 high speed, 高校 high school, 高価 high-priced. However in each of those cases, the Japanese spoken sound こう KOU is used for the concept of "bigness"... the pronunciations are こうげん KOUGEN, こうそく KOUSOKU, こうこう KOUKOU, こうか KOUKA respectively. Once we recognize こう KOU as a sound that Japanese uses for the concept of "bigness", we find other words that map this sound onto various kanji: "洪水 こうずい flood" obviously means "high water", the KOU in words such as "後期 こうき latter period" and "後半 こうはん second half" could well be related. Of course not all KOU sounds mean "high", there are various clusters of meaning including "travel", "school", and "public" that use the KOU sound.

In fact, rather than memorizing KANJI -> READINGS, a much better use of your time would be to memorize SOUNDS -> CONCEPTS at a syllable/word fragment/concept level, e.g.

KOU -> high, etc
SHA -> car/carriage/vehicle, person, etc
SEI -> government, etc

Then when you hear an unknown word, you can make a pretty good guess at what the possible meanings are - context will often make completely clear which option is correct. Learning KANJI -> CONCEPTS via RTK will give you a pretty good guess at what a written word means - context will often make completely clear which option is correct. Brute force memorization of "kanji readings" will give you... what... a half-assed guess at how to read a word? Which context won't help with? Which is only useful if one of the possible readings happens to be a word you recognize when spoken? Besides, you need to be able to comprehend written Japanese, and comprehend spoken Japanese, to get anything done in the language. How often do you really need to read aloud from text?
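To make the SOUNDS -> CONCEPTS idea concrete, here is a rough Python sketch of such a table and a lookup against it. The entries are only the ones mentioned in this post (the extra KOU clusters come from the 高 paragraph above). The function name `guess_concepts` and the naive substring matching on romanized text are my own assumptions for illustration, not a real segmentation method, and a fragment can easily match across syllable boundaries.

```python
# Sound fragments and the concept clusters this post associates with them.
# Illustrative only and far from complete.
SOUND_CONCEPTS = {
    "KOU": ["high", "travel", "school", "public"],
    "SHA": ["vehicle", "person"],
    "SEI": ["government"],
}


def guess_concepts(word):
    """Return {fragment: concepts} for every known fragment found in a romanized word.

    Naive assumption: a fragment "occurs" if it is a substring of the word.
    """
    word = word.upper()
    return {
        fragment: concepts
        for fragment, concepts in SOUND_CONCEPTS.items()
        if fragment in word
    }


# Hearing an unknown word, list the candidate meanings of its fragments.
print(guess_concepts("KOUSOKU"))
```

The output is a list of candidates, not an answer. Picking among "high", "travel", "school", and "public" is still left to context, which is the point of the argument above.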

SOUND -> CONCEPT is a natural, native Japanese relation. KANJI -> CONCEPT is a natural, native Chinese relation. KANJI -> SOUND is a freakish, unpredictable result of the Frankenstein nature by which two different language systems were smashed together. Trying to learn to read by memorizing the sounds of each character is very foreign to the nature of the Japanese language - an arrogant attempt to impose a practice natural to other languages onto Japanese learning.

Eventually, to be fluent, you'll be able to read aloud from Japanese text. But that is best accomplished by mastering each of the spoken and written systems, and learning how they align - not by brute forcing the second-hand relationship between characters and sounds. After brute-forcing kanji via RTK, and brute-forcing listening comprehension via SRS drills, then learning to read will provide lots of "aha!" moments as the two systems reinforce each other by how they overlap.

TODO: museum, coliseum, library, apothecary
TODO: 戦[たたか]い war, match 戦[たたか]う wage war, fight - rather than SEN, obv related to tataku (hit)
TODO: お 陰[かげ] (thanks to) 影[かげ] (shadow, shade)

Tuesday, July 13, 2010

Unity notes

MONITOR

Multiple monitor output: seems to require desktop spanning.

http://answers.unity3d.com/questions/640/fullscreen-on-a-second-monitor
http://answers.unity3d.com/questions/7396/dual-monitor-support
http://answers.unity3d.com/questions/3899/using-multiple-monitors-video-outputs



Use camera viewports to place camera output at certain locations onscreen.
http://unity3d.com/support/documentation/Components/class-Camera.html


how to set app to non-standard size?

if "fullscreen" is only single monitor, how to draw across full desktop without decoration?

is there a way to find total monitor-spanned desktop size? size of second monitor? for now, ability for user to enter second monitor resolution, and using that to place the output viewport in upper right corner of screen, would work.
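The "user enters second monitor resolution" fallback comes down to a bit of arithmetic. A rough sketch in Python (not Unity script), assuming the second monitor sits to the right of the primary with top edges aligned, and assuming the viewport is given as a normalized rectangle with its origin at the bottom left, which is how I read the Camera docs. The function name is made up for illustration.

```python
def second_monitor_viewport(primary, secondary):
    """Normalized (x, y, width, height) of the second monitor within the spanned desktop.

    primary and secondary are (width, height) in pixels.
    Assumes the second monitor is to the right of the primary, top edges
    aligned, and a bottom-left origin for the normalized rectangle.
    """
    primary_w, primary_h = primary
    secondary_w, secondary_h = secondary
    desktop_w = primary_w + secondary_w
    desktop_h = max(primary_h, secondary_h)
    x = primary_w / desktop_w
    # With a bottom-left origin, "upper right" means the rectangle's top
    # edge touches the top of the desktop.
    y = (desktop_h - secondary_h) / desktop_h
    return (x, y, secondary_w / desktop_w, secondary_h / desktop_h)


print(second_monitor_viewport((1920, 1080), (1024, 768)))
```

The four numbers would then go into the camera's viewport rectangle. If the monitors are arranged differently (second on the left, or stacked), the offsets change accordingly.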

Supposedly full-screen-on-second-monitor for windows was addressed and would be available in future version (presumably by 3.0) http://answers.unity3d.com/questions/640

There is also a workaround for window: set app res to match monitor res, then place window so header and frame are exactly offscreen... this also addresses the pops-out-of-fullscreen-when-app-focus-changes problem. Even a util for it: http://gwr.orekaria.com/
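The offscreen-frame trick is also just arithmetic. A sketch in Python, assuming desktop coordinates with the origin at the top left and y increasing downward. The border thickness and title bar height depend on the OS theme, so the values passed in below are placeholders, and the function name is hypothetical.

```python
def borderless_window_rect(monitor_x, monitor_y, monitor_w, monitor_h, border, title_bar):
    """Outer window rectangle (x, y, width, height) that pushes the frame offscreen.

    The client area (app resolution) is assumed to equal the monitor
    resolution. border and title_bar are frame thickness and header height
    in pixels; both are theme-dependent assumptions.
    """
    x = monitor_x - border
    y = monitor_y - border - title_bar
    width = monitor_w + 2 * border
    height = monitor_h + 2 * border + title_bar
    return (x, y, width, height)


# Second monitor at x=1920, 1024x768, with placeholder frame sizes.
print(borderless_window_rect(1920, 0, 1024, 768, border=4, title_bar=23))
```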



VIDEO

Video: no position control in native Unity: http://answers.unity3d.com/questions/5197/how-to-create-a-slider-to-control-video-playback

it is possible to get video resolution: http://answers.unity3d.com/questions/18945/how-to-get-video-resolution

it is theoretically possible to access pixels of iplimage from OpenCV and copy to Unity texture: http://answers.unity3d.com/questions/15574/import-video-from-camera

opencv cross-platform support notes: http://answers.unity3d.com/questions/16460/plugins-on-osx-opencv-and-emgucv

hey, this seems promising: a cross-platform QT plugin! http://www.unifycommunity.com/wiki/index.php?title=QTPlayback it doesn't have position control, but includes (non-free) source for the plugin

PLUGIN

native implementations for OSX and Windows http://unity3d.com/support/documentation/Manual/Plugins.html

Sunday, July 11, 2010

Choosing a 3D engine

Creating own VJ / video installation software. Need to choose the right engine.

Requirements:

* unlimited layers
* physics
* paths/splines
* highres support
* multiscreen support (eg matrox out)
* audio input
* open source
* cross platform
** windows
** osx
** linux
** iphone
** consoles
* graphics card accelerated
* saleable
* real-time editing (see changes without restart)
* reasonable end-user GUI controls

Candidates:

* max/msp jitter
* max + ogre
* unity
* torque
* C
* neo axis engine
** ogre-based
** windows only, OSX is "in progress"
* http://www.rtsoft.com/novashell/
* http://game-editor.com/
* http://love2d.org/

Software design:

* source - clip art / pattern
** tiling, movement, etc
* objects
* paths
*

Test app:

* create layers
* assign clip art to layers
* position/scale of each layer
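
A rough sketch of the data the test app would need, written in Python only to pin down the shape and not tied to any of the candidate engines. All names here (`Layer`, `Scene`, `create_layer`, the clip art file name) are hypothetical, and the back-to-front draw order is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """One layer: a piece of clip art plus its placement."""
    clip_art: str = ""            # hypothetical: name or path of the assigned clip art
    position: tuple = (0.0, 0.0)  # x, y in whatever units the engine uses
    scale: float = 1.0


@dataclass
class Scene:
    # Assumed draw order: first layer is furthest back.
    layers: list = field(default_factory=list)

    def create_layer(self):
        """Append a new empty layer on top and return it."""
        layer = Layer()
        self.layers.append(layer)
        return layer


scene = Scene()
layer = scene.create_layer()
layer.clip_art = "stars.png"  # placeholder file name
layer.position = (100.0, 50.0)
layer.scale = 2.0
print(scene)
```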