News

Community Core Audio: The release as the achievement of GSoC 2010

Added by Jimbo Zhang 21 days ago

GSoC 2010 has come to an end, so I am releasing the current version here as the result of this year's GSoC.
What's New in this release:
- Can recognize most English words instead of digits only.
- Added free-speaking mode (not very accurate, so using command-picking mode is recommended).
- Output to the network over TCP.
- OSX and Linux support.
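
For readers who want to consume the TCP output from another program, here is a minimal Python sketch of a client. The host, the port 3000, the idea that CCA accepts incoming connections, and the newline-delimited UTF-8 framing are all assumptions made for illustration. This post does not describe the actual wire format, so check the CCA source before relying on any of them.

```python
import socket


def iter_lines(sock, encoding="utf-8"):
    """Yield text lines from a connected socket until the peer closes it.

    Assumes (hypothetically) that each recognized sentence ends with a newline.
    """
    buffer = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # an empty read means the peer closed the connection
            break
        buffer += chunk
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            yield line.decode(encoding).rstrip("\r")
    if buffer:  # trailing text that was not newline-terminated
        yield buffer.decode(encoding)


def listen_to_cca(host="127.0.0.1", port=3000):
    """Connect to a hypothetical CCA endpoint and print what it sends."""
    with socket.create_connection((host, port)) as sock:
        for sentence in iter_lines(sock):
            print("recognized:", sentence)
```

Calling `listen_to_cca()` blocks until the connection is closed; `iter_lines` can be reused with any connected socket.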

You can download the CCA executable binary for your platform:
OSX : http://nuicode.com/attachments/download/177/CCA_GSoCFinal_bin_OSX.zip
Linux(32bit) : http://nuicode.com/attachments/download/179/CCA_GSoCFinal_bin_linux32.tar.gz
Windows(32bit) : http://nuicode.com/attachments/download/178/CCA_GSoCFinal_bin_win32.zip

Or check out from svn: svn checkout http://nuicode.svnrepository.com/svn/cca

ofxASR is part of CCA, but it is also a standalone openFrameworks addon.

ofxASR is hosted at http://code.google.com/p/ofxasr/.

There is also a thread about it on the openFrameworks forum: http://www.openframeworks.cc/forum/viewtopic.php?f=10&t=4202

Note that this is not CCA's "Final" version. The development of CCA will never stop.

I'm proud of CCA. I'm proud to be a contributor to NUI Group. I will keep developing and maintaining CCA, making it stronger and stronger, and I hope it will be downloaded by millions one day.

Thanks to NUI Group, thank you to my mentors tito and Christian, and thanks to everyone who gave me advice and feedback. What a good summer!

-------------------------------------------------------------------------------
Future TODOs:
- A GUI sentence list editor. (Currently users need to edit the config file manually.)
- A settings dialog.
- Automatic stop of recording (so users do not need to press the stop button).
- Other features requested by users.

Community Core Audio: CCA Development Progress 20100722

Added by Jimbo Zhang about 1 month ago

As requested by many folks, CCA has been successfully compiled under Mac OS X. An advantage of OS X is that applications can be packaged as a bundle. All resources are in a single app bundle, as shown in the screenshot below:

The config.xml and the list of recognizable sentences are also in the bundle. Users have to use "Show Package Contents" to view and edit them.

More good news: CCA can now recognize not only digits but also any English word or sentence, as shown in the following screenshot:

As seen in the screenshot, CCA is still in CommandPicking mode. That means you must edit the file Command.list in the bundle to specify which sentences CCA can recognize.

The FreeSpeaking mode is also available now, but it is very slow (maybe it needs a progress bar) and not accurate, so the CommandPicking mode is recommended. In the future (after GSoC) I will work on making the FreeSpeaking mode more useful.

The newest CCA for OS X can be downloaded at: http://nuicode.com/attachments/download/174/CCA_OSX_20100722.zip

Community Core Audio: First preview version for OS X

Added by Jimbo Zhang about 1 month ago

Hi all,

The first preview version of CCA for OS X has been released. It is packaged as an OS X style application bundle. You can download it here: http://nuicode.com/attachments/download/171/CCA_OSX_20100721.zip

The usage is the same as for the Windows version; please check http://nuigroup.com/log/cca_preview_release/ to learn how to use it.

Unfortunately it is very inaccurate: it usually gives wrong results under OS X, while the Windows version is very accurate. I'm not sure whether something is wrong in my code. My guess is that the problem comes from differences between Mac and PC hardware. The code is tracked at http://nuicode.com/projects/cca-alpha/repository, and it would be great if anyone could fix it.

Another known bug: the application's name and its path must not contain any spaces, or it may crash. This bug comes from Sphinx, and I will report it to the Sphinx team.

MT4J - Framework Extensions: Week 7-8

Added by Michael Magin about 1 month ago

Sorry for the delay with this update, but I wanted to wait for the major changes to be complete before posting.

Yesterday I moved the main development to the MT4j Google Code repository (Branches/CSSBranch). This was necessary because I need to use inheritance to make effective use of my code. If you would like to follow the code development, you can do it from there. In this repository I have created a new project, which contains the tests I need to test my changes.

Anyway, here are the changes of the past 2 weeks:

  • Major bugfixes to the font parsing sections
  • Added relative sizing
  • Many more tests
  • Support of background images - tiled and single
  • Support for the Background:, Border: and Font: multi-parameter tags
  • Major cleanup of the code
  • Moved the component handling out to the CSSHelper class
  • Moved all enums and lists of values out to the CSSKeywords class
  • Consolidated many classes
  • Updated class naming
  • Integration into MT4j (Branches/CSSBranch)
    • Integration of the CSSStyleManager into MTApplication
    • Integration of the functionality into MTPolygon and MTLine - nearly all components extend these classes
    • Integration of the CSSID in MTComponent

Right now the latter points are untested; stay tuned for more.

Community Core Audio: CCA - Preview Release 20070708

Added by Jimbo Zhang about 1 month ago

Introduction

Community Core Audio (CCA) is a GSoC project with a GUI similar to CCV's. CCA manages voice input, converts voice to text, and outputs messages to the network.

The preview release of CCA (for Windows) is now available for download at: http://nuicode.com/attachments/download/169/cca_20070707.zip

Below is a screen shot:

-------------------------------------------------------------------------------

Usage

The current version only supports command-picking mode, so please do not click the "FREE SPEAKING MODE" button in this preview version. :)
For details on these two modes, please read: http://nuicode.com/documents/75

Select the check box "RECORD SOUND" to start recording. The waveform will be shown dynamically in the viewer window.

Unselect the check box "RECORD SOUND" or click the "STOP" button to stop recording.

Select the check box "PLAY/PAUSE" to play, unselect it to pause. Click the "STOP" button to stop playing.

After recording audio, click the "SENT TO RECOGNIZE ENGINE" button, and the output viewer will display the sentence you just recorded.
The current version only supports English digits because it ships with simple Sphinx resources.

You can click the "CLEAR SCREEN" button to clear the output viewer.

-------------------------------------------------------------------------------

Configuration

For normal use you do not need to do any configuration; just download CCA and run it. However, CCA provides some options through config files.

The most important config file is $cca_path/data/config.xml. If you want to use new Sphinx resources, you must specify the path of the new resource files in this XML file. To learn about resource files, please read: http://nuicode.com/documents/74

The input audio sample rate is also set in config.xml. The input sample rate must be the same as the sample rate of the Acoustic Model (AM). The AM is part of the resource files.
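
The sample rate requirement can be checked before a recording is handed to the recognizer. The following Python sketch uses only the standard `wave` module. The function names and the 16000 Hz figure are hypothetical, and the sketch does not read config.xml or the AM files, whose layout is not described here.

```python
import io
import wave


def wav_sample_rate(wav_file):
    """Return the sample rate (Hz) stored in a WAV file or file-like object."""
    with wave.open(wav_file, "rb") as wav:
        return wav.getframerate()


def check_sample_rate(wav_file, model_rate_hz):
    """Raise ValueError if the recording does not match the model's rate."""
    actual = wav_sample_rate(wav_file)
    if actual != model_rate_hz:
        raise ValueError(
            f"recording is {actual} Hz but the acoustic model expects {model_rate_hz} Hz"
        )
    return actual


def make_silent_wav(rate_hz, frames=160):
    """Build a tiny mono 16-bit WAV in memory, used only for the demo below."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00\x00" * frames)
    buffer.seek(0)
    return buffer


# 16000 Hz is an assumed model rate, chosen only for illustration.
check_sample_rate(make_silent_wav(16000), 16000)
```

A recording made at a different rate would need to be resampled, or the input rate in config.xml changed, before recognition.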

The file $cca_path/data/commandList.txt is for CommandPicking mode. See this document: http://nuicode.com/documents/75

-------------------------------------------------------------------------------

Some Technical Details

A standalone oF addon for speech recognition, ofxASR, was released a few days ago. ofxASR is the core engine of CCA, and it can be used in any oF application. Currently it uses CMU Sphinx3 as its Automatic Speech Recognition (ASR) engine, but it is also designed to allow other ASR engines, such as Microsoft's SAPI or the Mac OS X Speech API. All engines share the same interface.

ofxASR is hosted at: http://code.google.com/p/ofxasr/
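
The idea of several engines behind one shared interface can be sketched as follows. This is a language-neutral illustration written in Python; ofxASR itself is a C++ addon, and the names `AsrEngine`, `FixedAnswerEngine` and `transcribe` are hypothetical and do not come from its source.

```python
from abc import ABC, abstractmethod


class AsrEngine(ABC):
    """Hypothetical common interface that every recognition engine implements."""

    @abstractmethod
    def recognize(self, samples, sample_rate_hz):
        """Return the recognized text for a sequence of audio samples."""


class FixedAnswerEngine(AsrEngine):
    """Stand-in engine that always returns the same text (for illustration)."""

    def __init__(self, answer):
        self.answer = answer

    def recognize(self, samples, sample_rate_hz):
        return self.answer


def transcribe(engine, samples, sample_rate_hz=16000):
    """Application code depends only on the interface, not on a concrete engine."""
    return engine.recognize(samples, sample_rate_hz)


print(transcribe(FixedAnswerEngine("seven"), [0, 0, 0]))
```

Because the application only calls `recognize`, swapping one engine for another would not require changes to the calling code.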

A class, ofRectPrint, was developed to print lines of text in a rectangle with automatic scrolling and manual scroll up/down.
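
The scrolling behaviour described above can be illustrated with a small Python sketch. It models only the bookkeeping (which lines are visible), not any drawing, and the class and method names are hypothetical rather than taken from ofRectPrint.

```python
class ScrollingLines:
    """Keep a list of lines and expose the window that fits in the rectangle."""

    def __init__(self, visible_rows):
        self.visible_rows = visible_rows
        self.lines = []
        self.top = 0  # index of the first visible line

    def _max_top(self):
        # Largest valid value of `top` so the window never runs past the end.
        return max(0, len(self.lines) - self.visible_rows)

    def append(self, text):
        """Add a line and auto-scroll so the newest line is visible."""
        self.lines.append(text)
        self.top = self._max_top()

    def scroll_up(self, count=1):
        self.top = max(0, self.top - count)

    def scroll_down(self, count=1):
        self.top = min(self._max_top(), self.top + count)

    def visible(self):
        """Return the lines that currently fit in the rectangle."""
        return self.lines[self.top:self.top + self.visible_rows]


view = ScrollingLines(visible_rows=2)
for word in ("one", "two", "three"):
    view.append(word)
print(view.visible())  # ['two', 'three']
```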

-------------------------------------------------------------------------------

TODO

  • Ship better Sphinx resources that support any English word instead of only digits.
  • The free-speaking mode.
  • Output to network.
  • OSX and Linux support.

MT4J - Framework Extensions: Week 6

Added by Michael Magin about 1 month ago

This week, I have finally managed to integrate the parser and the CSSStyle class into some of the major Components of MT4j, like MTRectangle, MTPolygon and MTTextArea, and have successfully applied all supported CSS syntax.

I have also added full inheritance support (global inheritance) and local stylesheets, which can be applied to single objects and which trump the global stylesheets.

To enable inheritance, I use the CSS selectors as follows:

Type Selectors: Select a specific class, like MTTextArea, but not classes that are descendants of that class (like MTTextField ??)
Class Selectors: Select a class and all its descendants
ID Selectors: MTComponents can be assigned a specific ID, which can be used to assign the style to all objects with that ID
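
The difference between the three selector kinds can be illustrated with a short sketch, written in Python rather than MT4j's Java. The classes `Component`, `TextArea` and `TextField` are stand-ins and not the real MT4j classes, and the `matches` function is a hypothetical helper, not part of the CSS branch.

```python
class Component:
    """Stand-in for a UI component that can carry an optional CSS ID."""

    def __init__(self, css_id=None):
        self.css_id = css_id


class TextArea(Component):
    pass


class TextField(TextArea):
    """A descendant of TextArea, used to show the type/class difference."""


def matches(kind, target, component):
    """Return True if the selector (kind, target) applies to the component."""
    if kind == "type":
        # Exactly this class; descendants do not match.
        return type(component) is target
    if kind == "class":
        # This class and every descendant.
        return isinstance(component, target)
    if kind == "id":
        # Any component that was assigned this ID.
        return component.css_id == target
    raise ValueError(f"unknown selector kind: {kind!r}")


field = TextField(css_id="search")
print(matches("type", TextArea, field))   # False
print(matches("class", TextArea, field))  # True
print(matches("id", "search", field))     # True
```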

As integration progresses, testing gets more and more complicated, because threading makes it hard to execute certain functions from the JUnit test class. I ended up writing pseudo-tests in real applications and checking the success using custom functions. As Therion has created a custom Font test for me to overcome such issues, I will try to clean up those tests and redo them properly once I get the hang of it.

MT4J - Framework Extensions: Week 5

Added by Michael Magin 2 months ago

This week the development has progressed more rapidly than before - Uni is finally over (to continue with a Master programme in October), so I finally have time to concentrate on the project.

This week, I have concentrated on writing tests for the CSS file parser and all CSS attributes I support. These tests include:
- Color
- Measures
- Fonts:
-- Serif, Sans, Mono, Normal
-- Bold, Italic, Oblique, Light, Normal
-- Font Color
-- Font-Size
-- Custom Fonts
- Selectors
- Borders

As these tests helped me to identify some bugs, I have adjusted the code so that all tests pass (before the modifications below).

Additionally, I have started integrating the results from the parser into the classes of MT4J, currently using specializations for testing. I have created MTCSSReactangle (extending MTRectangle) to test these modifications.

Furthermore, I have moved the Font generation into the CSSStyle class, to allow dynamic generation of fonts.

Finally, the inheritance system based on CSS selectors has been created in a basic form, yet I still have to test this functionality.
Stay tuned for week 6.

MT4J - Framework Extensions: Week 4

Added by Michael Magin 2 months ago

Week 4 has been all about testing the parser. It took me some time to set up the testing environment (a little frustrating in concurrent environments; logging is especially painful), but I have finally managed to set up testing with JUnit 4, using a custom pseudo MTApplication for the generation of IFonts.

The following tests have been set up (and the functions have already been bugfixed):
Width + Height
Measuring Units
Colors: Hex/RGB/Name
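
To make the three color notations concrete, here is a small Python sketch that parses hex, rgb() and named colors into RGB triples. It is an independent illustration, not the MT4j parser (which is written in Java). The function name and the short table of named colors are assumptions, and only a handful of names are included.

```python
import re

# A few standard CSS color names; a real parser would carry the full table.
NAMED_COLORS = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "red": (255, 0, 0),
    "green": (0, 128, 0),
    "blue": (0, 0, 255),
}


def parse_color(text):
    """Parse '#rrggbb', '#rgb', 'rgb(r, g, b)' or a known name into (r, g, b)."""
    value = text.strip().lower()
    if value in NAMED_COLORS:
        return NAMED_COLORS[value]
    match = re.fullmatch(r"#([0-9a-f]{6})", value)
    if match:
        digits = match.group(1)
        return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))
    match = re.fullmatch(r"#([0-9a-f]{3})", value)
    if match:
        # Short form: each hex digit is doubled, so '#f80' means '#ff8800'.
        return tuple(int(digit * 2, 16) for digit in match.group(1))
    match = re.fullmatch(r"rgb\(\s*(\d{1,3})\s*,\s*(\d{1,3})\s*,\s*(\d{1,3})\s*\)", value)
    if match:
        rgb = tuple(int(group) for group in match.groups())
        if all(channel <= 255 for channel in rgb):
            return rgb
    raise ValueError(f"unsupported color: {text!r}")
```

A unit test for each notation then reduces to comparing the returned triple with the expected one.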

Next on the list is the testing of the Font parsing, I expect loads of fun...

Community Core Audio: A preview version of current CCA

Added by Jimbo Zhang 2 months ago

Very glad to announce that CCA now has its basic functions.
It can now record a sound and convert it to text.

For ease of debugging, I use a very simple Acoustic Model that can only recognize a single English digit, and the converted text is not displayed in the view window but printed directly in the Windows cmd window, as in the screenshot below:

Download the preview version to try this program (for windows): http://nuicode.com/attachments/download/164/cca_20100617.zip

Usage:
Click "RECORD" to record, then click "SENT TO RECOGNIZE ENGINE"; the converted text will be printed in cmd.
Note that the current version can only recognize a single English digit (one, two, ... nine), and the FREE SPEAKING MODE is not supported yet.

MT4J - Framework Extensions: Week 3

Added by Michael Magin 2 months ago

This week has been all about fonts - especially fonts that are considered standard/generic in web development. I have integrated the DejaVu font set as standard font set to cover the generic fonts present in CSS, combined with the styles italic/oblique and normal, and weights normal, bold and light.

Changes:
- Added font support in the CSS parser:
-- Integrated the DejaVu Open Source Font
-- Generic types: Sans-Serif, Serif, Monospace
-- Styles: Italic/Oblique, Normal
-- Weight: Bold, Normal, Light
- Added support for custom fonts, specified as {font-family: Helvetica.ttf}
- Automatic loading of appropriate fonts from the DejaVu-Set
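
The split between generic families and custom fonts can be sketched like this. The Python function below is a hypothetical illustration, not the MT4j code. The file names are the regular-weight files shipped in the DejaVu distribution, and whether MT4j loads them under exactly these names is an assumption. Style and weight handling is left out.

```python
# Generic CSS families mapped to regular-weight DejaVu font files.
GENERIC_FONTS = {
    "sans-serif": "DejaVuSans.ttf",
    "serif": "DejaVuSerif.ttf",
    "monospace": "DejaVuSansMono.ttf",
}


def resolve_font_file(font_family):
    """Return the font file to load for a font-family value."""
    name = font_family.strip()
    generic = GENERIC_FONTS.get(name.lower())
    if generic is not None:
        return generic
    if name.lower().endswith(".ttf"):
        # Custom font given directly as a file name, e.g. Helvetica.ttf.
        return name
    raise ValueError(f"unknown font family: {font_family!r}")


print(resolve_font_file("Helvetica.ttf"))  # Helvetica.ttf
print(resolve_font_file("Monospace"))      # DejaVuSansMono.ttf
```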

Work in progress: it compiles, but I cannot get it to run just now, as my link to the svn project seems to be broken.
