PCM Audio Format

Pulse Code Modulation (PCM) is a method to represent sampled analog signals in digital form, which is the standard form for digital audio representation in computers. In order to convert an analog signal to PCM, two steps are required.

  • sampling: the magnitude of the analog signal is sampled at uniform intervals.
  • quantization: the value of each sample is rounded to the nearest value expressible by the bits allowed for each sample.

Two Basic Properties

Two basic properties determine how well a PCM sequence can represent the original signal.

  • sampling rate: the number of samples taken in a second
  • bit depth: the number of bits used to represent each sample, which determines the number of values each sample can take (e.g. 8 bits => 2^8 = 256 values)
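The two steps and the two properties above can be sketched in a few lines of Python 3. This is only an illustration, not a reference implementation: the function name sample_and_quantize and the choice of a unit-amplitude sine wave as the "analog" signal are assumptions made for the example.

```python
import math

def sample_and_quantize(freq_hz, sample_rate, bit_depth, num_samples):
    """Sample a unit-amplitude sine wave and quantize it to signed integers."""
    # Largest positive code for a signed sample, e.g. 127 for 8 bits.
    max_code = 2 ** (bit_depth - 1) - 1
    samples = []
    for n in range(num_samples):
        t = n / sample_rate                       # sampling: uniform time steps
        value = math.sin(2 * math.pi * freq_hz * t)
        samples.append(round(value * max_code))   # quantization: nearest level
    return samples

# A 1 kHz tone sampled at 8 kHz with a bit depth of 8 gives 8 samples per period.
pcm = sample_and_quantize(freq_hz=1000, sample_rate=8000, bit_depth=8, num_samples=8)
print(pcm)
```

A higher sampling rate gives more samples per period, and a higher bit depth gives finer levels for each sample.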

PCM Types

  • Linear PCM: The straightforward method of PCM. The samples are taken linearly and represented on a linear scale (as opposed to Logarithmic PCM etc.). It is an uncompressed format, which can be compressed by different audio codecs. When we talk about PCM, we're generally referring to Linear PCM.
  • Logarithmic PCM: the amplitudes of samples are represented in logarithmic form. There are two major variants of log PCM, mu-law (u-law) and A-law.
  • Differential PCM (DPCM): each sample value is encoded as the difference from the previous sample value. This can reduce the number of bits required for an audio sample.
  • Adaptive DPCM (ADPCM): the size of quantization step is varied so that the required bandwidth can be further reduced for a given signal-to-noise ratio.
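The idea behind DPCM can be sketched as follows. This is a simplified, lossless illustration that stores plain differences; real DPCM codecs typically also quantize the differences, which is not shown here. The function names are made up for the example.

```python
def dpcm_encode(samples):
    """Replace each sample with its difference from the previous sample."""
    previous = 0
    deltas = []
    for sample in samples:
        deltas.append(sample - previous)
        previous = sample
    return deltas

def dpcm_decode(deltas):
    """Rebuild the samples by accumulating the differences."""
    previous = 0
    samples = []
    for delta in deltas:
        previous += delta
        samples.append(previous)
    return samples

samples = [100, 102, 105, 104, 101]
deltas = dpcm_encode(samples)
print(deltas)               # after the first value, the differences are small
print(dpcm_decode(deltas))
```

Because neighbouring audio samples tend to be close to each other, the differences usually need fewer bits than the samples themselves.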

Audio File Formats That Support LPCM

LPCM audio is usually stored in aiff (.aiff, .aif, .aifc), wav (.wav, .wave), au (.au, .snd), and raw (.raw, .pcm) audio files.
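As a small illustration, the wave module in the Python 3 standard library can store LPCM samples in a wav container. The sketch below assumes 16-bit mono audio; the helper name write_lpcm_wav and the sample values are made up, and an in-memory buffer stands in for a real .wav file.

```python
import io
import struct
import wave

def write_lpcm_wav(target, samples, sample_rate):
    """Write 16-bit mono LPCM samples to a file name or file-like object."""
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes per sample = bit depth of 16
        wav.setframerate(sample_rate)  # sampling rate in Hz
        # WAV stores 16-bit samples as little-endian signed integers.
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))

buffer = io.BytesIO()
write_lpcm_wav(buffer, [0, 1000, -1000, 0], 8000)

# Read the header back to confirm what was stored.
buffer.seek(0)
with wave.open(buffer, "rb") as wav:
    info = (wav.getnchannels(), wav.getsampwidth(), wav.getframerate(), wav.getnframes())
print(info)
```

Passing a path such as "tone.wav" instead of the buffer would write a file to disk.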

An Intro to ELF File Format–Part 2 Sections and Segments

This is a follow up post of the Part 1 An Intro to ELF File Format.

Sections

We have seen the section headers of the section header table in Part 1; now we discuss sections in detail.

Below is a list of commonly seen section types. (Note that this is not a complete list.)

  • NULL: this marks the section header as inactive. It does not have an associated section.
  • PROGBITS: program content, including code, data, debugging info etc.
  • SYMTAB and DYNSYM: the section holds a symbol table. SYMTAB provides symbols for link editing and dynamic linking. DYNSYM holds only a minimal set of dynamic linking symbols.
  • STRTAB: holds a string table.
  • RELA: relocation entries with explicit addends.
  • REL: relocation entries without explicit addends.
  • DYNAMIC: information for dynamic linking.
  • HASH: holds a symbol hash table for looking up a symbol in an ELF file quickly.
  • NOTE: information that marks the file.
  • NOBITS: like PROGBITS, but without occupying space in the file. Used for BSS data allocated at program load time.

We can see the section type from the section header screenshots above. For example, the .interp is a section of type PROGBITS.

Sections also have flags associated with them. Below is a list of commonly seen flags.

  • WRITE (W): the section contains data that is writable when loaded.
  • ALLOC (A): the section occupies memory when the program is loaded.
  • EXECINSTR (X): the section contains executable machine code.
  • MERGE (M): the data in the section may be merged to eliminate duplication.
  • STRINGS (S): the data in the section contain null-terminated character strings.
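The flags are stored as bits of the sh_flags field in the section header. A minimal Python 3 sketch of decoding them into the letters listed above follows; the bit values are the SHF_* constants from the System V ABI, while the function name is made up for the example.

```python
# (bit value, letter) pairs for the flags discussed above.
SECTION_FLAGS = [
    (0x1, "W"),   # SHF_WRITE
    (0x2, "A"),   # SHF_ALLOC
    (0x4, "X"),   # SHF_EXECINSTR
    (0x10, "M"),  # SHF_MERGE
    (0x20, "S"),  # SHF_STRINGS
]

def decode_section_flags(sh_flags):
    """Return the flag letters that are set in an sh_flags value."""
    return "".join(letter for bit, letter in SECTION_FLAGS if sh_flags & bit)

print(decode_section_flags(0x3))   # ALLOC and WRITE, as for .data
print(decode_section_flags(0x6))   # ALLOC and EXECINSTR, as for .text
print(decode_section_flags(0x30))  # MERGE and STRINGS, as for .comment
```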

Below is a list of commonly seen sections.

Name | Type | Attributes | Notes
.bss | NOBITS | ALLOC and WRITE | Holds uninitialized data. The system initializes the data with zeros before the program starts to run.
.comment | PROGBITS | MERGE and STRINGS | Some remarks about the file.
.data | PROGBITS | ALLOC and WRITE | Holds initialized data.
.debug | PROGBITS | (none) | Holds information for symbolic debugging.
.dynamic | DYNAMIC | ALLOC (WRITE is processor specific) | Holds dynamic linking info.
.dynstr | STRTAB | ALLOC | Strings needed by dynamic linking, most commonly the names associated with dynamic linking symbol table entries.
.dynsym | DYNSYM | ALLOC | Symbol table for dynamic linking.
.fini | PROGBITS | ALLOC and EXECINSTR | Process termination code.
.got | PROGBITS | ALLOC and WRITE | Global Offset Table (GOT).
.hash | HASH | ALLOC | Symbol hash table.
.init | PROGBITS | ALLOC and EXECINSTR | Process initialization code.
.interp | PROGBITS | ALLOC is set if the file has a loadable segment that includes the section | Path name of a program interpreter.
.note | NOTE | (none) | Information that marks the file.
.plt | PROGBITS | ALLOC and EXECINSTR | Procedure Linkage Table (PLT).
.rel<name> | REL | ALLOC is set if the file has a loadable segment that includes relocation | Relocation info. <name> indicates the section to which the relocations apply.
.rela<name> | RELA | ALLOC is set if the file has a loadable segment that includes relocation | Relocation info. <name> indicates the section to which the relocations apply.
.rodata | PROGBITS | ALLOC | Read-only data.
.shstrtab | STRTAB | (none) | Section names.
.strtab | STRTAB | ALLOC is set if the file has a loadable segment that includes the symbol string table | Strings, most commonly the strings that represent the names associated with symbol table entries.
.symtab | SYMTAB | ALLOC is set if the file has a loadable segment that includes the symbol table | Symbol table.
.text | PROGBITS | ALLOC and EXECINSTR | Executable code of a program.

Segments

Executable and shared object ELF files statically represent programs. A system needs to use these files to create a dynamic representation (the process image) in order to execute the program. A process image consists of segments, which hold text, data, stack etc. and are created from the segments in ELF files.

We can get various information about the segments of an ELF file from its program headers. Below is a screenshot of the program headers in the test executable file.

$ readelf -l test

Each record has several fields.

  • Type: the segment type.
  • Offset: the location of the first byte of the segment with respect to the beginning of the file
  • VirtAddr: the virtual address of the first byte of the segment in memory
  • PhysAddr: the physical address of the segment. It only makes sense on systems where physical addressing is relevant.
  • FileSiz: segment size in number of bytes in file image of the segment
  • MemSiz: segment size in number of bytes in memory image of the segment.
  • Flg: flags
  • Align: memory alignment. Values 0 and 1 mean no alignment is needed. Otherwise, VirtAddr % Align == Offset % Align.
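To make the fields concrete, here is a hedged Python 3 sketch that unpacks one program header entry and checks the alignment rule. It assumes a 64-bit little-endian ELF file (the Elf64_Phdr layout); 32-bit files order the fields differently. The helper names and the sample values are made up for the example.

```python
import struct
from collections import namedtuple

# Elf64_Phdr: p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align
PHDR64 = struct.Struct("<IIQQQQQQ")
ProgramHeader = namedtuple(
    "ProgramHeader", "type flags offset vaddr paddr filesz memsz align")

def parse_phdr64(raw):
    """Unpack one 56-byte program header entry."""
    return ProgramHeader(*PHDR64.unpack(raw[:PHDR64.size]))

def is_alignment_consistent(ph):
    """Check VirtAddr % Align == Offset % Align (0 and 1 mean no alignment)."""
    if ph.align in (0, 1):
        return True
    return ph.vaddr % ph.align == ph.offset % ph.align

# A made-up LOAD entry (type 1) with flags R+E.
raw = PHDR64.pack(1, 5, 0, 0x400000, 0x400000, 0x6fc, 0x6fc, 0x200000)
ph = parse_phdr64(raw)
print(ph)
print(is_alignment_consistent(ph))
```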

Below is a list of commonly seen segment types in ELF files.

  • NULL: unused
  • LOAD: a loadable segment.
  • DYNAMIC: dynamic linking info
  • INTERP: the segment specifies the path to an interpreter. If present, it must precede any loadable segment.
  • NOTE: location and size of auxiliary info.
  • PHDR: specifies the size and location of the program header table itself, both in the file image and in the memory image of the program.

A segment consists of one or more sections. The two most commonly seen segments are the text segment and the data segment (both are of LOAD type).

A text segment contains read-only instructions and data. It usually includes the following sections: .text, .rodata, .hash, .dynsym, .dynstr, .plt, .rel.got. In the screenshot above, the third entry (with LOAD type) is a text segment.

A data segment contains writable data and instructions. It usually includes the following sections: .data, .dynamic, .got, .bss. In the screenshot above, the fourth entry (with LOAD type) is a data segment.
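One way to tell the two LOAD segments apart is to look at the permission bits in the Flg column. The Python 3 sketch below uses the PF_X, PF_W and PF_R bit values from the System V ABI; the classification rule itself is only a heuristic assumed for this example, and files with more than two LOAD segments will not fit it neatly.

```python
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4  # execute, write, read

def classify_load_segment(p_flags):
    """Guess whether a LOAD segment is a text or a data segment."""
    if p_flags & PF_W:
        return "data"   # writable, e.g. .data, .got, .bss
    if p_flags & PF_X:
        return "text"   # read-only and executable, e.g. .text, .plt
    return "other"

print(classify_load_segment(PF_R | PF_X))
print(classify_load_segment(PF_R | PF_W))
```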

References:

1. System V Application Binary Interface, Apr 2001 http://refspecs.linuxbase.org/elf/gabi4+/contents.html

2. The ELF Object File Format: Introduction. 1995. http://www.linuxjournal.com/article/1059?page=0,0

3. The ELF Object File Format by Dissection. 1995. http://www.linuxjournal.com/article/1060

An Intro to ELF File Format–Part 1 File Types and Dual Views

Executable and Linking Format (ELF) is the object file format used in UNIX-like operating systems. This post gives a brief introduction to the ELF file format in the context of Linux.
We will use the readelf tool from Binutils to view the content of the ELF files built from the "hello world" C program below.

#include <stdio.h>
int main(void) {
   printf("hello world!\n");
   return 0;
}

Save the code in a file named test.c, and then compile the source code into a *.o file and an executable with the commands below.

$ gcc test.c -c
$ gcc test.o -o test

The first command should produce a file named test.o, and the second command will output a file test.

File Types

Below are the four main types of ELF files we encounter during development.

  • Relocatable files: created by the compiler and assembler, usually ending with the .o extension, and processed by the linker to produce executable or library files.
  • Executable files: created by the linker with all relocation done and all symbols resolved, except for shared library symbols, which are resolved at run time. An executable file specifies how exec creates a program's process image.
  • Shared object files: created by the linker, containing the symbol information and runnable code needed by the linker. A shared object file can be used by the linker to create another object file, or used along with other shared objects and an executable file by the dynamic linker to create a process image.
  • Core file: a core dump file.

We can check whether a file is an ELF file with the file command as shown below.

$ file test.o

$ file test

The file command only gives some brief information about the two files. We can use the readelf tool to obtain more info.

$ readelf -h test.o

$ readelf -h test

The -h option displays the ELF file header. The Magic field marks the file as an ELF file. From the Type field, we can easily tell which ELF file type the file belongs to.
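The same two pieces of information can be read directly from the first bytes of the file. The Python 3 sketch below relies on the header layout from the System V ABI: the magic is the four bytes 0x7f 'E' 'L' 'F', byte 5 of e_ident gives the byte order, and the 2-byte e_type field follows the 16-byte e_ident. The function name and the synthetic header are made up for the example.

```python
import struct

ELF_TYPES = {
    0: "NONE",
    1: "REL (relocatable file)",
    2: "EXEC (executable file)",
    3: "DYN (shared object file)",
    4: "CORE (core file)",
}

def elf_file_type(header):
    """Return the ELF file type for the first 18 bytes of a file, or None."""
    if header[:4] != b"\x7fELF":
        return None                                # not an ELF file
    byte_order = "<" if header[5] == 1 else ">"    # EI_DATA: 1 = little-endian
    (e_type,) = struct.unpack_from(byte_order + "H", header, 16)
    return ELF_TYPES.get(e_type, "unknown")

# A synthetic header: 64-bit, little-endian, e_type = 1.
fake_header = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9) + struct.pack("<H", 1)
print(elf_file_type(fake_header))
```

For a real file, the input would be the first 18 bytes of test.o or test opened in binary mode.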

For a detailed explanation of the ELF header, we can refer to http://refspecs.linuxbase.org/elf/gabi4+/ch4.eheader.html.

Dual Views

ELF provides dual views of the file content: the linking view facilitates program linking, and the execution view facilitates program execution. This is shown in the figure below.

In the linking view, the file content is seen as sections. The section header table gives information about the section name, type, address, etc. This view is used by the linker to build programs and libraries, therefore the files used in the linking process must have a section header table. On the other hand, the execution view is used to create a process image and the file content is seen as segments. For this view, the program header table is mandatory.
We can view the section headers and program headers with the readelf command.

$ readelf -S test.o

$ readelf -l test.o

There are no program headers in this file.

$ readelf -S test

$ readelf -l test

The -S option displays the section headers while the -l option displays the program headers. Note that the -l option also indicates the section to segment mapping as shown in the last screenshot above. For more detailed info about section headers, we can refer to http://refspecs.linuxbase.org/elf/gabi4+/ch4.sheader.html.
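The number of entries in each table is recorded in the ELF header, so the two views can also be inspected without readelf. The Python 3 sketch below assumes a 64-bit little-endian file (the Elf64_Ehdr layout from the System V ABI); the function name and the synthetic header values are made up for the example.

```python
import struct

# Fields that follow the 16-byte e_ident in Elf64_Ehdr:
# e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags,
# e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx
EHDR64_REST = struct.Struct("<HHIQQQIHHHHHH")

def header_table_counts(header):
    """Return (number of program headers, number of section headers)."""
    fields = EHDR64_REST.unpack_from(header, 16)
    e_phnum, e_shnum = fields[9], fields[11]
    return e_phnum, e_shnum

# A synthetic relocatable-file header: no program headers, 13 section headers.
ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
fake_header = ident + EHDR64_REST.pack(1, 62, 1, 0, 0, 800, 0, 64, 0, 0, 64, 13, 12)
print(header_table_counts(fake_header))
```

A program header count of zero corresponds to the "There are no program headers in this file." message printed for test.o.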


What is Android NDK–a File by File View

Android NDK is a collection of tools and libraries that allow us to develop Android apps in C/C++. This post dissects the latest Android NDK release, r8c, by providing detailed information about its files and directories.

build: the build tools and scripts

–awk: some awk programs used internally by NDK build system, mainly for parsing various files of an Android NDK project.

–core: mk files used internally by NDK build system.
–gmsl: GNU make standard library.  A collection of functions implemented using native GNU Make functionality that provide list and string manipulation, integer arithmetic, associative arrays, stacks, and debugging facilities.

–tools: a collection of tools (mainly scripts) used for NDK compilation, installation, etc. We don't need to use these tools as an NDK app developer.

docs: the documentation
–ANDROID-ATOMICS.html: issues about Android atomic operations defined in sys/atomics.h

–ANDROID-MK.html: how Android.mk file works

–APPLICATION-MK.html: how Application.mk file works

–CHANGES.html: the NDK change log for all NDK versions.

–CPLUSPLUS-SUPPORT.html: C++ support on Android. Different runtimes are discussed.

–CPU-ARCH-ABIS.html: native code Application Binary Interface (ABI) management. Supported ABIs include armeabi, armeabi-v7a, x86 and mips.

–CPU-ARM-NEON.html: how to use the NEON features available on some arm cpu.

–CPU-FEATURES.html: how to detect CPU families and features using the cpu-features library.

–CPU-MIPS.html: how to enable MIPS support in Android NDK

–CPU-X86.html: how to enable x86 support in Android NDK

–DEVELOPMENT.html: how to modify, compile and generate release packages of NDK

–HOWTO.html: a collection of commonly used tips and tricks.

–IMPORT-MODULE.html: how to import a module outside of project source tree.

–INSTALL.html: how to install Android NDK

–LICENSES.html: license

–NATIVE-ACTIVITY.HTML: how to develop and build native activities and applications

–NDK-BUILD.html: how ndk-build command works

–NDK-GDB.html: how the ndk-gdb command works

–NDK-STACK.html: how to use the ndk-stack tool

–openmaxal:

—-index.html: Android specific doc about OpenMAX AL

—-OpenMAX_AL_1_0_1_Specification.pdf: the OpenMAX AL specification 1.0.1

–opensles

—-index.html: Android specific doc about OpenSL ES

—-OpenSL_ES_Specification_1.0.1.pdf: the OpenSL ES specification 1.0.1

–OVERVIEW.html: Android NDK overview

–PREBUILTS.html: How Android prebuilt library works

–sidenav.html: sidebar for the documentation, basically a menu

–STABLE-APIS.html: the list of stable APIs available at different Android API levels.

–STANDALONE-TOOLCHAIN.html: how to use the NDK toolchain as a standalone compiler.

–system

—-libc

——CHANGES.html: Android Bionic C change log.

——OVERVIEW.html: Android Bionic C library overview.

——SYSV-IPC.html: explains why Android doesn’t support System V IPCs.

–SYSTEM-ISSUES.html: a list of known issues in Android NDK and Android system images.

documentation.html: the documentation html main page

GNUmakefile: a small script used to detect the NDK path. It includes /build/core/main.mk and is possibly the entry point of the NDK build system.

ndk-build: the entry point for building NDK code

ndk-build.cmd: a Windows batch script to invoke NDK-specific GNU Make executables

ndk-gdb: the entry point for debugging NDK code

ndk-stack: a tool that allows us to trace the stack in a shared library with the logcat output. The details of this tool can be found at docs/NDK-STACK.html.

ndk-which: a tool to learn the path of the active toolchain components within the ndk.

platforms: the platform files, that is, the libraries and header files for the various Android API levels on different platforms.
–android-3:

—-arch-arm: the logical root directory for libraries and headers at Android-3 ARM

–android-4:

—-arch-arm: the logical root directory for libraries and headers at Android-4 ARM

–android-5:

—-arch-arm: the logical root directory for libraries and headers at Android-5 ARM

–android-8:

—-arch-arm: the logical root directory for libraries and headers at Android-8 ARM

–android-9:

—-arch-arm: the logical root directory for libraries and headers at Android-9 ARM

—-arch-mips: the logical root directory for libraries and headers at Android-9 MIPS

—-arch-x86: the logical root directory for libraries and headers at Android-9 x86

–android-14:
—-arch-arm: the logical root directory for libraries and headers at Android-14 ARM
——usr:
——–include:
——–lib:
—-arch-mips: the logical root directory for libraries and headers at Android-14 MIPS
——usr:
——–include:
——–lib:
—-arch-x86: the logical root directory for libraries and headers at Android-14 x86
——usr:
——–include:
——–lib:
prebuilt: contains the prebuilt binaries required by the host and the target platform
–android-arm: some binaries used by NDK on arm target
—-gdbserver
——gdbserver: gdb server
–android-mips: some binaries used by NDK on MIPS target
—-gdbserver
——gdbserver: gdb server
–android-x86: some binaries used by NDK on x86 target
—-gdbserver
——gdbserver: gdb server
–linux-x86: some binaries used by the Android NDK build system
—-bin:
——awk: an interpreted programming language typically used as a data extraction and reporting tool.
——make: a tool which controls the generation of executables and other non-source files of a program from source files.
——sed: stream editor. A utility that parses text and applies transformations to text.

README.TXT: introduction for Android NDK

RELEASE.TXT: contains NDK version.

samples: contains a few sample NDK projects.
–bitmap-plasma: jnigraphics API
–hello-gl2: OpenGL ES v2.
–hello-jni: basic example for JNI
–hello-neon: how to use cpufeatures to detect CPU features and how to use NEON feature of ARM CPU
–module-exports: how to use a module in another module.
–native-activity: how to create a native activity
–native-audio: OpenSL ES audio API
–native-media: OpenMAX AL media API
–native-plasma: native_app_glue
–san-angeles: OpenGL ES v1.
–test-libstdc++: build native executable
–two-libs: two libraries, where the second depends on the first one.

sources: some library modules and their source code
–android:

—-cpufeatures: a library that helps us detect the device CPU type and features. Source code is available.

—-libportable: Device Shared Library libportable. (???)

—-libthread_db: the sources of the special libthread_db that will be statically linked against the gdbserver binary. These are used automatically by the build-gdbserver.sh script. This is not an import module.

—-native_app_glue: the android_native_app_glue module, used for creating a native activity.

–cpufeatures: the cpufeatures import module. just a link to android/cpufeatures.

–cxx-stl: various C++ runtime libraries, refer to docs/CPLUSPLUS-SUPPORT.html for more details.
—-gabi++: gabi++ C++ runtime
—-gnu-libstdc++: GNU C++ runtime
—-stlport:  stlport C++ runtime
—-system:  system default C++ runtime

tests: scripts and sources to perform automated testing for NDK release
–awk: test files for awk
–build: contains tests used to check the NDK build system itself.
–check-release.sh: a few sanity checks on a given NDK release install/package.
–device: contains tests used to check that NDK-generated binaries work properly on an Android device.

–README: description about the folders under tests
–run-tests-all.sh: run all tests
–run-tests.sh: used to run NDK build tests. Without any parameters, this will try to run all standard tests.

–standalone: test programs and scripts for testing a standalone toolchain.

toolchains: toolchains for various platforms, we only show arm-linux-androideabi-4.6 in detail

–arm-linux-androideabi-4.6: toolchain used to compile for arm architecture android ABI on linux with gcc 4.6

—-config.mk: config file for the arm gcc-4.6 toolchain for the Android NDK

—-setup.mk: this file is used to prepare the NDK to build with the arm gcc-4.6 toolchain any number of source files. It defines (or re-defines) templates used to build various sources into target object files, libraries or executables.

—-prebuilt

——linux-x86

——–arm-linux-androideabi

———-lib:

————ldscripts:

———-bin: the tools in this directory are the same as the corresponding binaries (with the arm-linux-androideabi- prefix) in the linux-x86/bin directory, including ar, as, c++, g++, gcc, ld, ld.bfd, ld.gold, nm, objcopy, objdump, ranlib and strip

——–bin:

———-arm-linux-androideabi-addr2line: Convert addresses into line number/file name pairs. e.g.: arm-linux-androideabi-addr2line -C -f -e obj/local/armeabi/libnativemaprender.so 0003deb4 (Binutils)

———-arm-linux-androideabi-ar: archiver, used to create, modify and extract from libraries. It is normally used to create static libraries. (Binutils)

———-arm-linux-androideabi-as: assembler. (Binutils)

———-arm-linux-androideabi-c++: C++ front end, same as arm-linux-androideabi-g++ (gcc)

———-arm-linux-androideabi-c++filt: Filter to demangle encoded C++ symbols. (Binutils)

———-arm-linux-androideabi-cpp: C preprocessor. The C preprocessor implements the macro language used to transform C, C++, and Objective-C programs before they are compiled. (gcc)

———-arm-linux-androideabi-elfedit: examine and modify ELF metadata within an ELF object. (Binutils)

———-arm-linux-androideabi-g++: GCC compiler C++ front end (gcc)

———-arm-linux-androideabi-gcc: GCC compiler C front end (gcc)

———-arm-linux-androideabi-gcc-4.6: same as above. (gcc)

———-arm-linux-androideabi-gcov: program to test code coverage. (gcc)

———-arm-linux-androideabi-gdb: gdb debugger. (gdb)

———-arm-linux-androideabi-gdbtui: gdb text user interface. (gdb)

———-arm-linux-androideabi-gprof: display profiling info (Binutils)

———-arm-linux-androideabi-ld: linker. same as arm-linux-androideabi-ld.gold (Binutils)

———-arm-linux-androideabi-ld.bfd: linker using BFD, the Binary File Descriptor library.(Binutils)

———-arm-linux-androideabi-ld.gold: a new faster, ELF only linker. (Binutils)

———-arm-linux-androideabi-nm: Lists symbols from object files. (Binutils)

———-arm-linux-androideabi-objcopy: Copy and translates object files. (Binutils)

———-arm-linux-androideabi-objdump: Displays information from object files. (Binutils)

———-arm-linux-androideabi-ranlib: Generates an index to the contents of an archive. The index lists all the symbols defined by archive members that are relocatable object files. (Binutils)

———-arm-linux-androideabi-readelf: Displays information from any ELF format object file. (Binutils)

———-arm-linux-androideabi-run: for manipulating simulators.

———-arm-linux-androideabi-size:  Lists the section sizes of an object or archive file. (Binutils)

———-arm-linux-androideabi-strings: Lists printable strings from files. (Binutils)

———-arm-linux-androideabi-strip: remove symbols. (Binutils)

——–include:

———-lib:

————libiberty.a: contains routines used by various GNU programs, including getopt, obstack, strerror, strtol, and strtoul

————libarm-elf-linux-sim.a:

————libarm-linux-android-sim.a:

————gcc:
————–arm-linux-androideabi:
—————-4.6:
——————libgcov.a: a library used by GCC compiler to support code coverage test.
——————libgcc.a: a library used by GCC compiler for some low-level computations.
——————gcov-src: the source code for libgcov
——————crtbegin.o: program initialization code. refer to http://gcc.gnu.org/onlinedocs/gccint/Initialization.html
——————crtbeginS.o: variant of crtbegin.o
——————crtbeginT.o: variant of crtbegin.o
——————crtend.o: program destruction code.
——————crtendS.o: variant of crtend.o
——————arm-v7a: files for arm-v7a ABI
——————thumb: files for thumb code
——–lib32:
———-libbfd.a: the binary file descriptor library. It is a package which allows applications to use the same routines to operate on object files of different object file formats. (Binutils)

———-libbfd.la:  libtool library file for libbfd.a

———-libintl.a: it is a library that provides native language (non-english) support to programs. It is a part of gettext.

———-libexec: some utilities and libraries used by GCC internally
——–gcc:
———-arm-linux-androideabi:
————4.6: some utilities and libraries used by GCC internally, including collect2, cc1, cc1plus etc.
——–SOURCES: description about the sources for the toolchain
——–sysroot: root directory for headers and libraries
———-usr: prefix

————lib: same as content platforms/android-<depends on the sysroot set when compiling the toolchain>/arch-arm/usr/lib/

————include: same as content in platforms/android-<depends on the sysroot set when compiling the toolchain>/arch-arm/usr/include/

–arm-linux-androideabi-4.4.3: toolchain used to compile for arm architecture android ABI on linux with gcc 4.4.3

–arm-linux-androideabi-clang3.1: toolchain used to compile for ARM architecture android ABI on linux with clang 3.1 (contains the config.mk and setup.mk files only)

–llvm-3.1: contains the clang compiler from LLVM

–mipsel-linux-android-4.4.3: toolchain used to compile for MIPS architecture android ABI on linux with gcc 4.4.3

–mipsel-linux-android-4.6: toolchain used to compile for MIPS architecture android ABI on linux with gcc 4.6

–mipsel-linux-android-clang3.1: toolchain used to compile for MIPS architecture android ABI on linux with clang 3.1 (contains the config.mk and setup.mk files only)

–x86-4.4.3: toolchain used to compile for x86 architecture android ABI on linux with gcc 4.4.3

–x86-4.6:  toolchain used to compile for x86 architecture android ABI on linux with gcc 4.6

–x86-clang3.1:  toolchain used to compile for x86 architecture android ABI on linux with clang 3.1 (contains the config.mk and setup.mk files only)

Build Android NDK Toolchain From Source Code

Android NDK comes with a few toolchains under the toolchains directory. We can also build our own toolchain from the source code.

0. Download Latest Android NDK (r8c at the time of writing) from Android NDK website at http://developer.android.com/tools/sdk/ndk/index.html. Extract the downloaded archive.

$ tar xvf android-ndk-r8c-linux-x86.tar.bz2

1. Get into the ndk directory. Download the Android NDK source code to the src directory.

$ cd android-ndk-r8c/
$ ./build/tools/download-toolchain-sources.sh src

2. Install the following packages

sudo apt-get install libncurses5-dev
sudo apt-get install texinfo
sudo apt-get install bison
sudo apt-get install flex

3. Rebuilding the toolchain takes just one command. The two commands below build two different toolchains with different versions of GCC and GDB.

./build/tools/build-gcc.sh --verbose --gdb-version=6.6 $(pwd)/src $(pwd) arm-linux-androideabi-4.6

./build/tools/build-gcc.sh --verbose --gdb-version=7.3.x $(pwd)/src $(pwd) arm-linux-androideabi-4.7

The toolchain built can be found under toolchains directory of the Android NDK folder.

Dumping Python Pickle Files from Java

Pickle is a powerful serialization and deserialization mechanism supported by Python. With the Jython jar file, we can load and dump Python pickle files from Java. Loading a pickle file from Java has already been covered in a previous post. This post discusses how to dump data to a pickle file in Java so that a Python script can deserialize it.

0. Download Jython jar file and the sources.jar file from Jython website.  Suppose we are using Eclipse IDE, add the downloaded Jython jar file as a library. We can also set the sources.jar as the attached source for the jar library for easier debugging.

1. Dump the data to a pickle file ids.pkl. The following Java program demonstrates how to copy the data in a Java HashSet into a PyList and dump it.

import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.util.HashSet;

import org.python.core.PyFile;
import org.python.core.PyList;
import org.python.core.PyString;
import org.python.modules.cPickle;

public class DumpPickle {
    public static void main(String[] args) {
        HashSet<String> appIds = new HashSet<String>();
        appIds.add("1234321432");
        appIds.add("1234321433");
        appIds.add("xsydfsflkfds");
        DumpPickle afpd = new DumpPickle();
        afpd.dumpHashsetToPickle(appIds);
    }

    public void dumpHashsetToPickle(HashSet<String> pIds) {
        File f = new File("ids.pkl");
        OutputStream fs = null;
        try {
            fs = new FileOutputStream(f);
        } catch (FileNotFoundException e) {
            e.printStackTrace();
            return;
        }
        // Wrap the Java stream so that Jython can write to it.
        PyFile pyF = new PyFile(fs);
        // Copy the Java strings into a Jython list.
        PyList pyIdList = new PyList();
        for (String id : pIds) {
            PyString pyStr = new PyString(id);
            pyIdList.add(pyStr);
        }
        // Serialize the list, then flush and close the file.
        cPickle.dump(pyIdList, pyF);
        pyF.flush();
        pyF.close();
    }
}

Note that we will need to convert the Java data structure to Jython data structure before the dump operation.

2. Load the data from the pickle file ids.pkl. Below is a simple Python script that deserializes the data from ids.pkl to a list.

#!/usr/bin/python
import pickle

# Open in binary mode so the script behaves the same on every platform.
with open("ids.pkl", "rb") as f:
    ids = pickle.load(f)

for item in ids:
    print(item)

3. Sample Execution

We can run the Java program first to produce the ids.pkl file, and then run the Python script, which should print out the following (the order may vary, since a HashSet does not guarantee iteration order).

1234321433
xsydfsflkfds
1234321432

As expected, we can get the list from the ids.pkl file.
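To try the loading side without Jython, a similar round trip can be sketched in pure Python 3. This is only an approximation of what the Java program produces: it assumes pickle protocol 0 (the text protocol) is a reasonable stand-in for the Jython output, and it uses an in-memory buffer in place of ids.pkl. The helper names are made up for the example.

```python
import io
import pickle

def dump_ids(ids, fileobj):
    """Pickle the ids as a list, like the PyList in the Java program."""
    # Protocol 0 is an assumption made for this sketch.
    pickle.dump(list(ids), fileobj, protocol=0)

def load_ids(fileobj):
    """Read the list back, as the Python script does."""
    return pickle.load(fileobj)

buffer = io.BytesIO()
dump_ids({"1234321432", "1234321433", "xsydfsflkfds"}, buffer)
buffer.seek(0)
loaded = load_ids(buffer)
print(sorted(loaded))
```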

Side note: the Jython jar file used for testing is jython-2.5.3.jar.

References:

Jython website: http://www.jython.org/