News
MT4J - Framework Extensions: Week 7-8
Sorry for the delay in posting this update; I wanted to wait until the major changes were complete.
Yesterday I moved the main development to the MT4j Google Code repository (Branches/CSSBranch). This was necessary because I need to use inheritance to make effective use of my code. If you would like to follow the development, you can do so from there. In this repository I have also created a new project, which contains the tests I need to verify my changes.
Anyway, here are the changes of the past 2 weeks:
- Major bugfixes to the font parsing sections
- Added relative sizing
- Many more tests
- Support of background images - tiled and single
- Support for the Background:, Border: and Font: multi-parameter tags
- Major cleanup of the code
- Outsourced the component handling to the CSSHelper class
- Outsourced all enums and lists of values to the CSSKeywords class
- Consolidated many classes
- Updated class naming
- Integration into MT4j (Branches/CSSBranch)
- Integration of the CSSStyleManager into MTApplication
- Integration of the functionality into MTPolygon and MTLine - nearly all components extend these classes
- Integration of the CSSID in MTComponent
Right now, the last few points are still untested; stay tuned for more.
Community Core Audio (Alpha): CCA - Preview Release 20070708
Introduction
Community Core Audio (CCA) is a GSoC project with a GUI similar to that of CCV. CCA manages voice input, converts voice to text, and outputs messages to the network.
The preview release of CCA (for Windows) is now available for download at: http://nuicode.com/attachments/download/169/cca_20070707.zip
Below is a screenshot:
Usage
The current version only supports command-picking mode, so please do not click the "FREE SPEAKING MODE" button in this preview version. :)
For details on the two modes, please read: http://nuicode.com/documents/75
Select the check box "RECORD SOUND" to start recording. The waveform is displayed live in the viewer window.
Clear the check box "RECORD SOUND" or click the "STOP" button to stop recording.
Select the check box "PLAY/PAUSE" to play, and clear it to pause. Click the "STOP" button to stop playing.
After recording some audio, click "SENT TO RECOGNIZE ENGINE", and the output viewer will display the sentence you just recorded.
The current version only supports English digits because of the simple Sphinx resources it ships with.
You can click the "CLEAR SCREEN" button to clear the output viewer.
Configuration
For normal use you do not need to configure anything; just download and run it. However, CCA provides some options through config files.
The most important config file is $cca_path/data/config.xml. If you want to use new Sphinx resources, you must specify the path to the new resource files in this XML file. To learn about resource files, please read: http://nuicode.com/documents/74
The input audio sample rate is also set in config.xml. It must be the same as the sample rate of the Acoustic Model (AM), which is part of the resource files.
The file $cca_path/data/commandList.txt is used in command-picking mode. See this document: http://nuicode.com/documents/75
Some Technical Detail
A standalone oF addon for speech recognition, ofxASR, was released a few days ago. ofxASR is the core engine of CCA, and it can be used in any oF application. Currently it uses CMU Sphinx3 as its Automatic Speech Recognition (ASR) engine, but it is designed to allow other ASR engines as well, such as Microsoft's SAPI or the Mac OSX Speech API. All engines share the same interface.
ofxASR is hosted at: http://code.google.com/p/ofxasr/
A class, ofRectPrint, was developed to print lines of text in a rectangle, with auto-scroll and manual scroll up/down.
TODO
- Ship better Sphinx resources that support general English words rather than only digits.
- The free-speaking mode.
- Output to network.
- OSX and Linux support.
MT4J - Framework Extensions: Week 6
This week, I finally managed to integrate the parser and the CSSStyle class into some of the major components of MT4j, such as MTRectangle, MTPolygon and MTTextArea, and successfully applied all supported CSS syntax.
I have also added full inheritance support (global inheritance) and local stylesheets, which can be applied to single objects and take precedence over the global stylesheets.
To enable inheritance, I use the CSS selectors as follows:
Type Selectors: Select a specific class, such as MTTextArea, but not classes that are descendants of that class (such as MTTextField?)
Class Selectors: Select a class and all of its descendants
ID Selectors: MTComponents can be assigned a specific ID, which can be used to assign the style to all objects with that ID
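The three selector types described above could be matched roughly as follows. This is only a minimal sketch under stated assumptions: `Component`, `TextArea` and `TextField` are hypothetical stand-ins for the MT4j classes, and `Selector` is an illustrative class, not the one in the CSSBranch code.

```java
public class SelectorSketch {
    // Hypothetical stand-ins for MT4j components (not the real classes).
    static class Component {
        final String cssId; // may be null when no ID was assigned
        Component(String cssId) { this.cssId = cssId; }
    }
    static class TextArea extends Component {
        TextArea(String cssId) { super(cssId); }
    }
    static class TextField extends TextArea {
        TextField(String cssId) { super(cssId); }
    }

    enum SelectorType { TYPE, CLASS, ID }

    // Illustrative selector: TYPE and CLASS carry a target class, ID carries an ID string.
    static final class Selector {
        final SelectorType type;
        final Class<?> target;
        final String id;

        private Selector(SelectorType type, Class<?> target, String id) {
            this.type = type;
            this.target = target;
            this.id = id;
        }

        static Selector ofType(Class<?> target) { return new Selector(SelectorType.TYPE, target, null); }
        static Selector ofClass(Class<?> target) { return new Selector(SelectorType.CLASS, target, null); }
        static Selector ofId(String id) { return new Selector(SelectorType.ID, null, id); }

        boolean matches(Component c) {
            switch (type) {
                case TYPE:
                    // Exact class only; subclasses do not match.
                    return c.getClass() == target;
                case CLASS:
                    // The class itself and every descendant.
                    return target.isInstance(c);
                case ID:
                    return id != null && id.equals(c.cssId);
                default:
                    return false;
            }
        }
    }

    public static void main(String[] args) {
        Component field = new TextField("title");
        System.out.println(Selector.ofType(TextArea.class).matches(field));
        System.out.println(Selector.ofClass(TextArea.class).matches(field));
        System.out.println(Selector.ofId("title").matches(field));
    }
}
```

In this sketch a type selector compares the runtime class exactly, while a class selector relies on `Class.isInstance`, which is what makes descendants match.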
As integration progresses, testing gets more and more complicated: threading makes it hard to execute certain functions from the JUnit test class, so I ended up writing pseudo-tests in real applications and checking for success using custom functions. Therion has created a custom font test for me to overcome such issues, so I will try to clean up those tests and redo them properly once I have got the hang of it.
MT4J - Framework Extensions: Week 5
This week the development has progressed more rapidly than before - Uni is finally over (to continue with a Master programme in October), so I finally have time to concentrate on the project.
This week, I have concentrated on writing tests for the CSS file parser and all CSS attributes I support. These tests include:
- Color
- Measures
- Fonts:
-- Serif, Sans, Mono, Normal
-- Bold, Italic, Oblique, Light, Normal
-- Font Color
-- Font-Size
-- Custom Fonts
- Selectors
- Borders
As these tests helped me identify some bugs, I have adjusted the code so that all tests passed (before the modifications below).
Additionally, I have started integrating the results from the parser into the MT4J classes, currently using specializations for testing. I have created MTCSSRectangle (extending MTRectangle) to test these modifications.
Furthermore, I have moved the Font generation into the CSSStyle class, to allow dynamic generation of fonts.
Finally, the inheritance system based on CSS selectors has been created in a basic form, yet I still have to test this functionality.
Stay tuned for week 6.
MT4J - Framework Extensions: Week 4
Week 4 has been all about testing the parser. It took me some time to set up the testing environment (a little frustrating in concurrent environments; logging is especially painful), but I have finally managed to set up testing with JUnit 4, using a custom pseudo MTApplication for the generation of IFonts.
The following tests have been set up (and the functions they cover have already been bug-fixed):
Width + Height
Measuring Units
Colors: Hex/RGB/Name
Next on the list is testing the font parsing. I expect loads of fun...
Community Core Audio (Alpha): A preview version of current CCA
I am very glad to announce that CCA now has its basic functions.
It can now record sound and convert it to text.
To make debugging easier, I use a very simple acoustic model that can only recognize a single English digit, and the converted text is not yet displayed in the view window but printed directly to the Windows command prompt, as the screenshot below shows:

Download the preview version to try the program (for Windows): http://nuicode.com/attachments/download/164/cca_20100617.zip
Usage:
Click "RECORD" to record, then click "SENT TO RECOGNIZE ENGINE"; the converted text will be printed in the command prompt.
Note that the current version can only recognize a single English digit (one, two, ... nine), and FREE SPEAKING MODE is not supported yet.
MT4J - Framework Extensions: Week 3
This week has been all about fonts, especially fonts that are considered standard/generic in web development. I have integrated the DejaVu font set as the standard font set to cover the generic fonts present in CSS, combined with the styles italic/oblique and normal, and the weights normal, bold and light.
Changes:
Added font support in the CSS parser:
Integrated the DejaVu Open Source Font
Generic types: Sans-Serif, Serif, Monospace
Styles: Italic/Oblique, Normal
Weight: Bold, Normal, Light
Added support for custom fonts, specified as {font-family: Helvetica.ttf}
Automatic loading of appropriate fonts from the DejaVu-Set
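The automatic font selection could look something like the sketch below, which maps a generic family, weight and slant to a DejaVu file name. The enum and method names are hypothetical, and the file names are those of the standard DejaVu distribution as far as I know them; how the project actually bundles and names the fonts is an assumption. DejaVu ships a light cut only for upright Sans (ExtraLight), so other light requests fall back to the normal weight here.

```java
public class DejaVuFontSketch {
    enum Family { SANS_SERIF, SERIF, MONOSPACE }
    enum Weight { LIGHT, NORMAL, BOLD }

    // Returns the assumed DejaVu file name for a generic CSS font request.
    static String fontFile(Family family, Weight weight, boolean slanted) {
        String base;
        String slantName;
        switch (family) {
            case SERIF:
                base = "DejaVuSerif";
                slantName = "Italic";   // the Serif family uses "Italic" in its file names
                break;
            case MONOSPACE:
                base = "DejaVuSansMono";
                slantName = "Oblique";
                break;
            default:
                base = "DejaVuSans";
                slantName = "Oblique";
                break;
        }

        // Only upright Sans has a light variant; everything else falls back to normal weight.
        if (weight == Weight.LIGHT && family == Family.SANS_SERIF && !slanted) {
            return "DejaVuSans-ExtraLight.ttf";
        }

        StringBuilder suffix = new StringBuilder();
        if (weight == Weight.BOLD) {
            suffix.append("Bold");
        }
        if (slanted) {
            suffix.append(slantName);
        }
        return suffix.length() == 0
                ? base + ".ttf"
                : base + "-" + suffix + ".ttf";
    }

    public static void main(String[] args) {
        System.out.println(fontFile(Family.SERIF, Weight.BOLD, true));
        System.out.println(fontFile(Family.MONOSPACE, Weight.NORMAL, false));
    }
}
```

Treating italic and oblique as a single `slanted` flag mirrors the "Italic/Oblique" grouping in the list above.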
Work in progress: it compiles, but I cannot get it to run just now, as my link to the svn project seems to be broken.
MT4J - Framework Extensions: Week 2
As these are the weeks of my final (final final) exams at my University, progress is kinda slow as I have to devote most of my time to studying statistics and managerial accounting. Yet I have decided to at least achieve some measurable result every week until the exams are over (June 18th), before I can start coding at full speed again.
So, here are my achievements for this week:
Added parsing and conversion for measuring units:
px
pc
em
in
cm
mm
I have used a value of 100 dpi for the unit conversion, and a 16pt font as the base for the em conversion.
Added parsing for Colors:
rgb(12,34,56) format
#123456 format
16 web colours (black, white, purple, green, red...)
Added parsing for boolean values:
e.g. visibility
Added parsing for BorderStyles:
none, hidden, solid, dashed, dotted
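The unit conversion and colour parsing listed above can be sketched as follows. This is a minimal illustration, not the project's parser: the class and method names are hypothetical, only three of the 16 named colours are included, and converting em via 16pt at the same 100 dpi is my assumption about how the two stated values combine.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class CssValueSketch {
    static final double DPI = 100.0;       // resolution stated in the post
    static final double EM_POINTS = 16.0;  // base font size for em, in points

    // Converts a value such as "2.54cm" or "1em" to pixels.
    static double toPixels(String value) {
        String v = value.trim().toLowerCase(Locale.ROOT);
        if (v.length() < 3) {
            throw new IllegalArgumentException("Expected <number><unit>: " + value);
        }
        String unit = v.substring(v.length() - 2);
        double n = Double.parseDouble(v.substring(0, v.length() - 2).trim());
        switch (unit) {
            case "px": return n;
            case "in": return n * DPI;
            case "cm": return n * DPI / 2.54;
            case "mm": return n * DPI / 25.4;
            case "pc": return n * 12.0 * DPI / 72.0;      // 1 pica = 12 points
            case "em": return n * EM_POINTS * DPI / 72.0; // assumption: em goes through points
            default: throw new IllegalArgumentException("Unsupported unit: " + unit);
        }
    }

    // A small subset of the 16 named web colours.
    static final Map<String, int[]> NAMED = new HashMap<>();
    static {
        NAMED.put("black", new int[] {0, 0, 0});
        NAMED.put("white", new int[] {255, 255, 255});
        NAMED.put("purple", new int[] {128, 0, 128});
    }

    // Parses "#123456", "rgb(12,34,56)" or a known colour name into {r, g, b}.
    static int[] parseColor(String value) {
        String v = value.trim().toLowerCase(Locale.ROOT);
        if (v.matches("#[0-9a-f]{6}")) {
            int rgb = Integer.parseInt(v.substring(1), 16);
            return new int[] {(rgb >> 16) & 0xFF, (rgb >> 8) & 0xFF, rgb & 0xFF};
        }
        if (v.startsWith("rgb(") && v.endsWith(")")) {
            String[] parts = v.substring(4, v.length() - 1).split(",");
            if (parts.length != 3) {
                throw new IllegalArgumentException("Expected three components: " + value);
            }
            int[] out = new int[3];
            for (int i = 0; i < 3; i++) {
                out[i] = Integer.parseInt(parts[i].trim());
            }
            return out;
        }
        int[] named = NAMED.get(v);
        if (named == null) {
            throw new IllegalArgumentException("Unknown colour: " + value);
        }
        return named.clone();
    }

    public static void main(String[] args) {
        System.out.println(toPixels("1in"));
        System.out.println(java.util.Arrays.toString(parseColor("#123456")));
    }
}
```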
That's it for this week, stay tuned for more.
MT4J - Framework Extensions: First Week
Sunday... I finally managed to submit some code to the subversion repository, though I have to admit that most of this week was spent mourning my Win7 dev machine, which decided to give me headaches right at the start of GSoC.
Anyway, I finally got the project into Eclipse on my primary machine, a Mac Pro (Eclipse is a bit more buggy on a Mac...). The problem was that Eclipse on a Mac doesn't seem to import class folders properly, which took me some time to correct.
But here are the accomplishments of the week.
- Migrated the project to my Mac as my Win7 Dev Machine isn't cooperating
- Added basic data structures:
- CSSStyle + BorderStyle
- Selector + SelectorType
- Created some sample extensions to generic MT4j Classes (MTCSSRectangle etc.)
- Created a class (parserConnector) for accessing the Apache Batik CSS Parser
- Integrated the Apache Batik project and the SAC 1.3 interfaces as interface to the CSS files
- Started Implementation of CSS DocumentHandler, which processes the CSS files
- Current Status: Parsing of Selectors, including descendants, implemented
Community Core Audio (Alpha): Weekly Progress Report 20100519
CCA now has its basic GUI and can record and play back audio. The Windows and OSX versions are being developed in parallel.
I spent some time compiling sphinx3, exploring it, and writing a hello world program. You can download it here: http://nuicode.com/attachments/download/160/SimpleSphinxDecoder_win32.zip
This hello world is a VS2008 project that includes the compiled sphinx3 lib and dll. As I described in my GSoC proposal, CCA will support two modes: free speaking mode and command picking mode. The hello world supports both modes, but you must modify the config file to select the mode you want. That requires a little knowledge of ASR; if you have no idea how to do it, just ask me on IRC or Skype. :)
BTW, compared with HTK, Sphinx is quite complicated when you want to use a custom grammar instead of the default word graph.
