GradientDrawable on Android

On Android, a Drawable represents something that can be drawn on screen, and there are many Drawable types. This article covers GradientDrawable, which can also be defined in XML with the <shape> element.

Confusingly, there is also ShapeDrawable, which the Android documentation and training guide claim is the class defined in XML with the <shape> element. This is not right, and one can look at the ShapeDrawable source code to confirm it.

The following screenshot captures what can be done with GradientDrawable. The source can be found here.

Figure 1. Drawings Generated by Android GradientDrawable


The table below shows which elements are not available for certain shapes.

Element \ Shape    Line  Rectangle  Oval  Ring
corners            NA    yes        NA    NA
Linear gradient    NA    yes        yes   yes
Radial gradient    NA    yes        yes   yes
Sweep gradient     NA    yes        yes   yes
solid              NA    yes        yes   yes

We’ll first introduce the four shapes, and then go through some of the important XML elements.


Line

The line shape is of limited use: many elements are not supported, and any effect we can draw with a line can also be achieved with a rectangle.

One thing to note is that a dashed stroke won’t render properly with hardware acceleration due to a known bug. Suppose we’re drawing the line on an ImageView; the following code is needed to force software rendering.

 imageView.setLayerType(View.LAYER_TYPE_SOFTWARE, null);


Rectangle

Rectangle is used to draw a rectangle-like shape. It’s the only shape that can have the corners element, which makes it very powerful. The corners element allows one to specify the radius at each of the four corners of the rectangle.

One can use the rectangle shape to draw a line by specifying a small height, or draw a circle by specifying a proper corner radius.
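As a sketch, a rounded rectangle drawable could be defined like this (the file name and attribute values are made-up examples):

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- res/drawable/rounded_rect.xml: a solid rectangle with rounded corners
     and a thin stroke -->
<shape xmlns:android="http://schemas.android.com/apk/res/android"
    android:shape="rectangle">
    <solid android:color="#FF2196F3" />
    <corners android:radius="8dp" />
    <stroke android:width="1dp" android:color="#FF1565C0" />
</shape>
```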


Oval

Oval can be used to draw a circle or an oval.


Ring

Ring is used to draw a ring with a specified thickness. The innerRadius, innerRadiusRatio, thickness, thicknessRatio, and useLevel attributes apply only when the shape attribute is ring.

innerRadius specifies the radius of the inner hole of the ring. innerRadiusRatio instead specifies the inner radius as a ratio of the ring’s width. For example, innerRadiusRatio="5" means the inner radius equals the ring’s width divided by 5.

thickness defines the thickness of the ring. thicknessRatio is similar to innerRadiusRatio, specifying the thickness as a ratio of the ring’s width.
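As a sketch, a ring whose inner radius is one fifth of its width could be defined like this (file name and values are made-up examples; note that useLevel="false" is typically needed for the ring to be drawn at all when the drawable is not used with a level):

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- res/drawable/ring.xml: inner radius = width / 5, thickness = width / 10 -->
<shape xmlns:android="http://schemas.android.com/apk/res/android"
    android:shape="ring"
    android:innerRadiusRatio="5"
    android:thicknessRatio="10"
    android:useLevel="false">
    <solid android:color="#FF4CAF50" />
</shape>
```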


Gradient

Three gradient types are supported: linear, radial and sweep. One can refer to Figure 1 for the drawings they can create.

All gradients must specify startColor and endColor. centerColor is optional, and its relative position is configurable via centerX and centerY (both in the range (0, 1.0)).


Linear gradient

A linear gradient specifies a starting color that gradually transforms into an ending color. It is also the default gradient type. Linear gradient has a special attribute named angle, which must be a multiple of 45 in the range [0, 315].

Figure 2. Angle of Linear Gradient
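As a sketch, a left-to-right linear gradient (angle 0) with an optional center color could be defined like this (file name and colors are made-up examples):

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- res/drawable/gradient_rect.xml: linear gradient from startColor through
     centerColor to endColor, left to right -->
<shape xmlns:android="http://schemas.android.com/apk/res/android"
    android:shape="rectangle">
    <gradient
        android:type="linear"
        android:angle="0"
        android:startColor="#FFFF0000"
        android:centerColor="#FF00FF00"
        android:endColor="#FF0000FF" />
</shape>
```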


Radial gradient

A radial gradient defines a gradient that fans out from a center point. It has a special attribute, gradientRadius, which must be defined.


Padding

Padding is applied to the containing View element (it pads the position of the View’s content, not the shape).

For example, if this oval is set as the background of an ImageView, the padding will be applied to position the image resource (the view content) of the ImageView.


Size

Defines the shape’s width and height. By default, the shape scales to the size of its container View, proportionate to the dimensions defined here. When the shape is used in an ImageView, scaling can be restricted by setting android:scaleType to "center".


  1. GradientDrawable Android doc
  2. Android drawable resource shape

Choose the Right Clock on Android

The Android framework provides three different timekeeping facilities. This post discusses the main differences between them and their usage.


System.currentTimeMillis()

It returns the number of milliseconds elapsed since the Unix epoch, 00:00:00 January 1, 1970 UTC.

Note that this clock is not guaranteed to be monotonic, for the following reasons:

  • The clock can be changed by network time sync
  • The clock can be updated by user or program

The clock can jump backwards or forwards unpredictably. We can listen for the ACTION_TIME_TICK, ACTION_TIME_CHANGED and ACTION_TIMEZONE_CHANGED Intent broadcasts to find out when the time changes.

When to use

  • Get the wall clock time
  • Interval/elapsed time that needs to span across device reboots. This is not ideal, but sometimes a cheap and acceptable solution. Ideally we want a server-provided timestamp to avoid all the unpredictable jumps this clock can make.

When not to use

  • Interval or elapsed timing that doesn’t need to span across device reboots, e.g. memory cache expiration time


SystemClock.uptimeMillis()

It returns the number of milliseconds since device boot, not counting time spent in deep sleep: the clock stops when the system enters deep sleep. The clock is guaranteed to be monotonic.

This clock is the basis for Thread.sleep(millis), Object.wait(millis), and System.nanoTime().

When to use

  • Interval or elapsed timing that doesn’t need to span across device reboots, where we want to exclude time in device deep sleep


SystemClock.elapsedRealtime()

It returns the number of milliseconds since device boot, including time spent in deep sleep: the clock does not stop even when the system enters deep sleep. It’s also guaranteed to be monotonic.

elapsedRealtimeNanos(): similar to the method above, but returns the number of nanoseconds.

When to use

  • Interval or elapsed timing that doesn’t need to span across device reboots, where we want to include time in device deep sleep


1. When using System.nanoTime or elapsedRealtimeNanos, differences spanning greater than approximately 292 years (2^63 nanoseconds) will not be computed correctly due to numerical overflow.

2. When comparing two timestamps, we should use t2 - t1 > 0 instead of t2 > t1, because of possible numerical overflow.

To help us understand why t2 - t1 > 0 is better than t2 > t1 in the presence of numerical overflow, let’s assume that t1 and t2 are bytes in the range [-128, 127].
When t1 = 126, t2 = t1 + 3 results in numerical overflow, and t2 = -127. If we use t2 > t1, it returns false. But t2 - t1 = -127 - 126 overflows again, and the result is 3, which is positive as expected.

This only works when the difference between t2 and t1 can itself be represented by the data type. In the byte example above, t2 - t1 must be less than or equal to 127.
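The byte example above can be sketched in code (a minimal demonstration; the same difference-based comparison applies to long nanosecond timestamps):

```java
// Demonstrates why t2 - t1 > 0 tolerates wrap-around while t2 > t1 does not,
// using bytes so the overflow is easy to see.
public class TimestampCompare {
    // Returns true if t2 is "after" t1, tolerating numerical overflow.
    static boolean isAfter(byte t1, byte t2) {
        return (byte) (t2 - t1) > 0;
    }

    public static void main(String[] args) {
        byte t1 = 126;
        byte t2 = (byte) (t1 + 3); // 129 overflows to -127
        System.out.println(t2);              // -127
        System.out.println(t2 > t1);         // false: naive comparison fails
        System.out.println(isAfter(t1, t2)); // true: difference-based comparison works
    }
}
```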

ArrayMap and its friends in Android

In a previous post we discussed SparseArray and its friends in Android. While they can replace HashMap when integers and longs are used as keys, they don’t work when the key is an arbitrary object.

Actually, similar ideas can be applied to create a more memory-efficient generic map data structure. This post goes through ArrayMap, ArraySet and their friends, which replace HashMap and HashSet in a more memory-efficient way.


ArrayMap implements the Map<K, V> interface, and it’s fully compatible with HashMap. There’s a simplified version of ArrayMap, SimpleArrayMap, which doesn’t carry the burden of maintaining compatibility with the Java Map interface, but still provides most of the commonly needed methods.

Correspondingly, there’s ArraySet, which implements the Set<V> interface.

We should consider using ArrayMap and ArraySet in the following cases:

1. The number of items is less than about 1000.
2. Reads are frequent, while insertions and removals are infrequent.

And if we don’t need to maintain compatibility with the Java interface, we can use SimpleArrayMap wherever ArrayMap would be used.


We described how HashMap works internally in SparseArray and its friends in Android. Basically, it uses an array of linked lists, and lots of unused space is allocated in the array.

For ArrayMap, two arrays are used. One stores the hash codes of all keys; the other stores the key-value pairs, with each key at an even index and its value at the following odd index.

Figure 1. Internal arrays in ArrayMap [image from ref 1]

Binary search is applied on the hash code array to locate the index, and the corresponding index in the second array is used to store and retrieve the key and value. Specifically,

1. For get, the hash code of the key is computed and used in a binary search to find the index. Then two scans are performed to find an equal key in the second array, one searching forward and the other searching backward (to handle hash collisions). If no equal key is found, null is returned.

2. For put, if an equal key is found, the value is updated; otherwise a new key-value pair is inserted at the end of the range of equal hash codes. Array resizing might happen in the process.

3. ArrayMap internally caches arrays of size 4 and 8 to avoid spamming the garbage collector. This further improves memory efficiency when the number of items in the map is small.
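The lookup strategy above can be sketched in plain Java (a simplified illustration, not the real ArrayMap: the class name is made up, and hash collisions, the forward/backward scans, array caching, and efficient resizing are all ignored):

```java
import java.util.Arrays;

// A sorted array of hash codes searched with binary search, plus a parallel
// array holding keys at even indices and values at odd indices.
public class TinyArrayMap {
    private int[] hashes = new int[0];
    private Object[] keyValues = new Object[0];
    private int size = 0;

    public Object get(Object key) {
        int index = Arrays.binarySearch(hashes, 0, size, key.hashCode());
        if (index < 0 || !keyValues[index * 2].equals(key)) {
            return null;
        }
        return keyValues[index * 2 + 1];
    }

    public void put(Object key, Object value) {
        int index = Arrays.binarySearch(hashes, 0, size, key.hashCode());
        if (index >= 0 && keyValues[index * 2].equals(key)) {
            keyValues[index * 2 + 1] = value; // equal key found: update value
            return;
        }
        int insertAt = index >= 0 ? index : -index - 1;
        // Grow both arrays and shift elements to make room for the new pair.
        int[] newHashes = new int[size + 1];
        Object[] newKeyValues = new Object[(size + 1) * 2];
        System.arraycopy(hashes, 0, newHashes, 0, insertAt);
        System.arraycopy(hashes, insertAt, newHashes, insertAt + 1, size - insertAt);
        System.arraycopy(keyValues, 0, newKeyValues, 0, insertAt * 2);
        System.arraycopy(keyValues, insertAt * 2, newKeyValues,
                (insertAt + 1) * 2, (size - insertAt) * 2);
        newHashes[insertAt] = key.hashCode();
        newKeyValues[insertAt * 2] = key;
        newKeyValues[insertAt * 2 + 1] = value;
        hashes = newHashes;
        keyValues = newKeyValues;
        size++;
    }

    public static void main(String[] args) {
        TinyArrayMap m = new TinyArrayMap();
        m.put("one", 1);
        m.put("two", 2);
        System.out.println(m.get("one"));   // 1
        System.out.println(m.get("three")); // null
    }
}
```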

Advantages and Disadvantages

The internals of ArrayMap offer the following advantages compared with HashMap.

1. It’s more memory-efficient: no unused space is allocated.

2. When there are no elements, ArrayMap doesn’t allocate memory, whereas HashMap allocates unused space.

3. Iteration with keyAt and valueAt is just like iterating over an array, which is very efficient. HashMap uses an Iterator, which is slower and takes more memory.

But these advantages are not free:

1. Accessing a value in ArrayMap is O(log n), whereas in HashMap it is O(1). This makes little difference when the number of elements in the map is less than about 1000.

2. Insertion and deletion are relatively expensive, since we need to shift array elements, or allocate a new array and copy elements over.


1. Android Fun with ArrayMaps performance video.

SparseArray and its friends in Android

One of the most important factors affecting Android app performance is memory. The Android framework provides some special classes to help developers optimize memory usage. SparseArray and its friends are some of them.

How to use

SparseArray and its friends are designed to replace HashMap in Android when there isn’t a large number of elements in the map. Below we list the Java usage and its corresponding Android replacement.

SparseArray          replaces HashMap<Integer, Object>
SparseBooleanArray   replaces HashMap<Integer, Boolean>
SparseIntArray       replaces HashMap<Integer, Integer>
SparseLongArray      replaces HashMap<Integer, Long>
LongSparseArray      replaces HashMap<Long, Object>
LongSparseLongArray  replaces HashMap<Long, Long>

Note that the Android support library has two classes, SparseArrayCompat and LongSparseArray, for backward compatibility. SparseArrayCompat supports methods like removeAtRange, which is only available on SparseArray from API 19.

When to use

In a word, we should use SparseArray and its friends when both of the following are met:
1. The keys of the map are integers or longs.
2. The number of elements is at most several hundred.

This is because of the following advantages offered by SparseArray:
1. The key is a primitive type, so putting an element doesn’t require creating an additional Entry object, while HashMap does.
2. Also because the key is primitive, there’s no auto-boxing of keys.

According to the Android Memories presentation, for holding 1000 elements, SparseArray uses about 8 KB while HashMap uses about 64 KB.

class HashMap<K, V> {         //  Class = 12 + 8 * 4 = 48 bytes
    Entry<K, V>[] table;      //  Entry = 32 + 16 + 16 = 64 bytes
    Entry<K, V>[] forNull;    //  Array = 20 + 1000 * 64 = 64,024 bytes
    int size;
    int modCount;
    int threshold;
    Set<K> keys;
    Set<Entry<K, V>> entries; //  Total = 64,136 bytes
    Collection<V> values;
}

class SparseIntArray {        //  Class = 12 + 3 * 4 = 24 bytes
    int[] keys;               //  Array = 20 + 1000 * 4 = 4,024 bytes
    int[] values;
    int size;                 //  Total = 8,072 bytes
}

But the memory efficiency doesn’t come for free:
1. HashMap computes the hash code to find the index at which to insert a new element in O(1). SparseArray uses binary search to find the insertion index in O(log n), which is less time-efficient. According to the official Android documentation, the difference is less than 50% for up to hundreds of items.
2. SparseArray is Android-specific, meaning it is less portable.

In addition, SparseArray offers some additional features:
1. The keys put into a SparseArray are sorted in ascending order.
2. We can retrieve an individual key or value by index.

In contrast, HashMap doesn’t guarantee any order of the map. In fact, the order can change over time.


The pros and cons of SparseArray vs HashMap are due to their implementations.

For SparseArray, keys and values are kept in separate arrays.

1. Adding an element uses binary search to find the insertion position.
a. If there’s already an element for the key, replace its value.
b. If there isn’t, and the slot at the insertion position is marked as DELETED, reuse that slot.
c. Otherwise, move the array elements to make room (growing the array capacity if necessary), and then insert the element.

2. Removing an element is optimized: the class simply marks the entry as DELETED. The array is only actually compacted when required to answer a query correctly, e.g. when the size is queried, when the array needs to grow, or when a valid index is needed.

Note that the method that compacts the array is called gc, though it doesn’t necessarily cause a Java runtime GC to happen.

3. Because keys and values are kept in separate parallel arrays and inserted at positions found by binary search, the keys are guaranteed to be sorted.
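The binary-search insertion and lazy DELETED-marker removal described above can be sketched in plain Java (a simplified illustration, not the real SparseIntArray: the class name is made up, values are Objects rather than ints so a sentinel can mark deleted slots, and compaction is omitted):

```java
import java.util.Arrays;

// Sorted parallel key/value arrays, binary search on keys, and lazy removal
// via a DELETED marker.
public class TinySparseArray {
    private static final Object DELETED = new Object();
    private int[] keys = new int[8];
    private Object[] values = new Object[8];
    private int size = 0;

    public Object get(int key) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        return (i < 0 || values[i] == DELETED) ? null : values[i];
    }

    public void put(int key, Object value) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        if (i >= 0) {
            values[i] = value;           // existing (possibly DELETED) slot: replace
            return;
        }
        i = -i - 1;                      // insertion point from binary search
        if (size == keys.length) {       // grow the arrays
            keys = Arrays.copyOf(keys, size * 2);
            values = Arrays.copyOf(values, size * 2);
        }
        System.arraycopy(keys, i, keys, i + 1, size - i);
        System.arraycopy(values, i, values, i + 1, size - i);
        keys[i] = key;
        values[i] = value;
        size++;
    }

    public void remove(int key) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        if (i >= 0) values[i] = DELETED; // lazy removal: just mark the slot
    }

    public static void main(String[] args) {
        TinySparseArray s = new TinySparseArray();
        s.put(10, "ten");
        s.put(3, "three");
        s.remove(10);
        System.out.println(s.get(3));  // three
        System.out.println(s.get(10)); // null
    }
}
```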

HashMap is implemented using an array of linked list.

1. Adding an element causes the class to compute the hash value of the key object (another hash is applied on top of the hash code to further reduce collisions). Based on the hash, an array index is found at which to insert the value.

At each array index there’s a linked list of Entry objects, each with a key, value, hash, and a pointer to the next Entry. When the same key (same reference or equals) is used, the old value is updated. Otherwise, a new Entry object is created and added to the linked list.

2. Removing an element is similar: find the index in the array based on the hash, and traverse the linked list to find the correct Entry object to remove.

For a class with a good hashCode implementation, the number of collisions will be small and the linked list at each array index will be short.

Android LayerDrawable and Drawable.Callback

LayerDrawable is a Drawable that manages an array of Drawables, where each Drawable is a layer of it. It can be used to compose fancy visual effects. However, used incorrectly, it can also introduce difficult-to-catch bugs. This post discusses a bug that can be caused by the Callback mechanism of Drawable.

Chain of Callbacks

In a LayerDrawable, every layer drawable registers the LayerDrawable as its Drawable.Callback. This allows the layer to inform the LayerDrawable when it needs to redraw. As shown in the invalidateSelf method of Drawable below, DrawableCallback.invalidateDrawable is called to inform the client (the LayerDrawable, in this case) that the drawable needs to redraw.

public void invalidateSelf() {
    final Callback callback = getCallback();
    if (callback != null) {
        callback.invalidateDrawable(this);
    }
}

Also, a View registers itself as the Drawable.Callback of its background drawable, so when the drawable needs to redraw, the View can be informed and invalidated. So if we set the background of a View to a LayerDrawable, we have a chain of DrawableCallbacks. This is illustrated in the figure below.


Figure 1. Chain of Callbacks when Setting LayerDrawable as View’s Background

When the layer Drawable needs to redraw, it calls its callback.invalidateDrawable, which in turn calls the LayerDrawable’s callback.invalidateDrawable, which causes the View to redraw the corresponding region if necessary.

View Breaks Callback of Old Background Drawable

So far all works well, but the setBackgroundDrawable method of View has the following code snippet.

public void setBackgroundDrawable(Drawable background) {
    ...
    /*
     * Regardless of whether we're setting a new background or not, we want
     * to clear the previous drawable.
     */
    if (mBackground != null) {
        mBackground.setCallback(null);
        ...
    }
    if (background != null) {
        background.setCallback(this);
        ...
    }
    ...
}
This means setting a new background will break the old background drawable’s callback by setting it to null, regardless of whether the callback still points to the View.

The Bug

With all the knowledge above, we can “fabricate” a bug by following the steps below.

  1. Set a Drawable A as a background of a View V. The A.callback is set to V.

  2. Create a LayerDrawable L with A as one layer. Now A.callback is set to L.

  3. Now set another Drawable (or just null) as background of V. V will set callback of its old background (A in our example) to null, which breaks the link between A and L.

  4. Updates to A won’t trigger L to update.

The solution to this issue is to update the background of V before creating L. This is illustrated in the code example here.

In the example, we try animating the action bar background. The background consists of two layers: the first is a green ColorDrawable, and the second a bitmap drawable. We want to animate the alpha value of the bitmap drawable, so it appears the image fades in.

The code contains two methods; one works and the other doesn’t, due to the bug we described. Each method can be triggered by a button. Below we extract the key part of the code; one can refer to the link for more details.

Button btn1 = (Button) findViewById(R.id.btn1); // resource ids assumed
btn1.setOnClickListener(new View.OnClickListener() {
  @Override
  public void onClick(View v) {
    // 1. This sets launcherIconDrawable.callback to actionBar
    getActionBar().setBackgroundDrawable(launcherIconDrawable);
    animateActionBarWorking();
  }
});
Button btn2 = (Button) findViewById(R.id.btn2);
btn2.setOnClickListener(new View.OnClickListener() {
  @Override
  public void onClick(View v) {
    // 1. This sets launcherIconDrawable.callback to actionBar
    getActionBar().setBackgroundDrawable(launcherIconDrawable);
    animateActionBarNotWorking();
  }
});

private void animateActionBarNotWorking() {
  Drawable[] layers = new Drawable[] { colorLayer, launcherIconDrawable };
  // 2. This sets launcherIconDrawable.callback to layerDrawable
  LayerDrawable layerDrawable = new LayerDrawable(layers);
  // 3. This sets launcherIconDrawable.callback to null, because
  // launcherIconDrawable is the old background being replaced
  getActionBar().setBackgroundDrawable(layerDrawable);
  ValueAnimator valueAnimator = ValueAnimator.ofInt(0, 255);
  valueAnimator.addUpdateListener(new ValueAnimator.AnimatorUpdateListener() {
    @Override
    public void onAnimationUpdate(ValueAnimator animation) {
      // 4. Updating launcherIconDrawable will not trigger the action bar
      // background to update, as launcherIconDrawable.callback is null
      launcherIconDrawable.setAlpha((Integer) animation.getAnimatedValue());
    }
  });
  valueAnimator.start();
}

private void animateActionBarWorking() {
  // Key difference: replace the action bar background before creating the
  // LayerDrawable, so the callback set by the LayerDrawable constructor is
  // not cleared afterwards. The rest is analogous to the method above; see
  // the linked source for full details.
  ...
}


  1. LayerDrawable:

Support Vector Machine Concept: VC Dimension


Given a set of n samples $x_1, x_2, …, x_n$, we want to label them as either -1 or 1. In total, there are $2^n$ possible label combinations.

A class of learning machines H can be used to label the samples. If for each label combination we can always find a learning machine $h \in H$ that labels it correctly, we say that H shatters the n points.

The VC (Vapnik-Chervonenkis) dimension is then defined as the maximum number of points that can be shattered by H; it measures the capacity of the hypothesis class H.

Note that VC(H) = 4 does not mean H can shatter any 4 points in the plane; as long as there exists some set of 4 points that can be shattered by H, that’s enough. It also implies that no set of more than 4 (say 5) points in the plane can be shattered by H.

Now consider a simple example: find the VC dimension of a hypothesis class H defined as the set of straight lines in the two-dimensional plane.

We can find 3 points a, b and c such that all $2^3$ possible label combinations can be classified correctly by H.


As mentioned before, it’s also possible to find 3 points that H cannot shatter. For example, place a, b and c on the same line; there will be combinations that a straight line cannot label correctly. But this doesn’t affect the VC dimension, as it’s defined by the maximum number of points that can be shattered.

One can try to separate 4 points using a straight line, but no matter where we place those 4 points, there will be label combinations that cannot be separated. Therefore, the VC dimension of straight lines in the plane is 3.


Naive Bayes

Bayes’ Theorem

Let’s start from Bayes’ theorem, also referred as Bayes’ law or Bayes’ rule.

P(A|B) = P(B, A) / P(B)

= P(B|A) * P(A) / P(B)

= P(B|A) * P(A) / (P(B|A) * P(A) + P(B|^A) * P(^A))

P(A): prior probability. The probability that event A happens.

P(^A): the probability that event A does not happen.

P(B): evidence, or background. The probability that event B happens.

P(B|A), P(B|^A): conditional probability, or likelihood. The probability that event B happens given that A happened or did not happen, respectively.

P(A|B): posterior probability. The probability that A happens, taking into account B for and against A.

Naive Bayes

When used for classification, Bayes’ Theorem can be expressed as below,

P(C|F1, F2, … Fn) = P(C, F1, F2, …, Fn) / P(F1, F2, … , Fn)

= P(F1, F2, … Fn|C) * P(C) / P(F1, F2, …, Fn)

C is some class/label we can classify a sample into, and F1, F2, … Fn represents features of the sample data.

P(F1, F2, …, Fn) doesn’t depend on C and is normally given, or can be calculated from the probability of each feature. It’s effectively a constant and can be ignored for classification purposes.

The numerator can be expressed as following,

P(C, F1, F2 … , Fn)

= P(C) * P(F1, F2, … , Fn|C)

= P(C) * P(F1 | C) * P(F2, F3, … Fn | F1, C)

= P(C) * P(F1 | C) * P(F2 | F1, C) * P(F3, … Fn | F1, F2, C)

= P(C) * P(F1 | C) * P(F2 | F1, C) * P(F3 | F1, F2, C) * …  * P(Fn | F1, F2, …, Fn-1, C)

In Naive Bayes, all features are assumed to be conditionally independent given the class. Thus Fi is independent of every other feature Fj where j != i, given C. Therefore we have

P(F2 | F1, C) = P(F2 | C)

P(F3 | F1, F2, C) = P(F3 | C)

P(Fn | F1, F2, … Fn-1, C) = P(Fn | C)


P(C, F1, F2 … , Fn) = P(C) * P(F1 | C) * P(F2 | C) * P(F3 | C) * … * P(Fn | C)

For example, suppose two authors A and B like to use the words “love”, “life” and “money”. The probabilities of these words appearing in A’s articles are 0.1, 0.1 and 0.8, and in B’s 0.5, 0.3 and 0.2. Now, given the phrase “love life”, which author is more likely to have written it?

Without any other information, there’s a 50% prior probability for either A or B. Assuming the words are independent features, we can use Naive Bayes.

P(A | love, life) = P(A) * P(love | A) * P(life | A) / P(love, life) = 0.5 * 0.1 * 0.1 / P(love, life)

P(B | love, life) = P(B) * P(love | B) * P(life | B) / P(love, life) = 0.5 * 0.5 * 0.3 / P(love, life)

Clearly, it’s more likely that the phrase “love life” was written by author B. Note that P(love, life) is independent of the authors and is just a scaling factor.
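The two-author example above can be checked with a few lines of code (a minimal sketch; the class and method names are made up, and the word probabilities are the ones given in the text):

```java
// Compares the unnormalized posteriors P(author) * P(love | author) * P(life | author),
// which is all we need since P(love, life) is the same scaling factor for both.
public class NaiveBayesExample {
    // Unnormalized posterior for an author with the given prior and
    // per-word likelihoods.
    static double score(double prior, double pLove, double pLife) {
        return prior * pLove * pLife;
    }

    public static void main(String[] args) {
        double scoreA = score(0.5, 0.1, 0.1); // ≈ 0.005
        double scoreB = score(0.5, 0.5, 0.3); // ≈ 0.075
        System.out.println(scoreB > scoreA);  // true: B is the more likely author
    }
}
```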


  1. Bayes’ theorem:
  2. Naive Bayes classifier:
  3. Udacity, Intro to Machine Learning. Lesson 1. Naive Bayes.

Performance Metrics for Binary Classification

Binary classification classifies samples as either 0 (negative) or 1 (positive). Depending on which class/label a sample actually belongs to and its prediction result, each sample falls into a cell of the following table.

                     actual positive       actual negative
predicted positive   true positive (TP)    false positive (FP)
predicted negative   false negative (FN)   true negative (TN)

We can then calculate different metrics, which measure the classification results from different angles.

True Positive Rate (Sensitivity, hit rate, recall): Out of all the positive samples, what fraction is actually detected as positive.

TPR = TP / (TP + FN)

True Negative Rate (Specificity): Out of all the negative samples, what fraction is actually detected as negative.

TNR = TN / (TN + FP)

Positive Predictive Value (Precision): Out of all samples predicted as positive, what fraction is actually positive.

PPV = TP / (TP + FP)

Negative Predictive Value: Out of all samples predicted as negative, what fraction is actually negative.

NPV = TN / (TN + FN)

False Positive Rate (Fall out): Out of all negative samples, what fraction is detected as positive by mistake.

FPR = FP / (FP + TN) = 1 – TNR

False Discovery Rate: Out of all samples predicted as positive, what fraction is actually negative.

FDR = FP / (FP + TP) = 1 – PPV

Accuracy: Out of all samples, what fraction is predicted correctly. That is, positive samples are predicted as positive, negative samples are predicted as negative.

Accuracy = (TP + TN) / (P + N)

The table below gives a nice overall view of the metrics mentioned above,

                     actual positive       actual negative
predicted positive   true positive (TP)    false positive (FP)   PPV = TP / (TP + FP)
predicted negative   false negative (FN)   true negative (TN)    NPV = TN / (FN + TN)
                     TPR = TP / (TP + FN)  TNR = TN / (FP + TN)  Accuracy = (TP + TN) / (P + N)

F1 score: the harmonic mean of precision and recall.

F1 = 2 * TPR * PPV / (TPR + PPV) = 2TP / (2TP + FP + FN)

Matthews correlation coefficient: it takes into account true and false positives and negatives, and is regarded as a balanced measure. It can be used even for cases where the numbers of samples in the different classes vary drastically.

MCC = (TP * TN - FP * FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN))

MCC returns a value between -1 and 1, where -1 indicates total disagreement between prediction and actual values, 0 means no better than random guessing, and 1 indicates perfect prediction.
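The metrics above can be computed directly from the four confusion-matrix counts (a minimal sketch; the class name and example counts are made up):

```java
// Computes TPR, TNR, PPV, accuracy, F1 and MCC from TP/TN/FP/FN counts,
// following the formulas in the text.
public class BinaryMetrics {
    static double tpr(int tp, int fn) { return (double) tp / (tp + fn); }
    static double tnr(int tn, int fp) { return (double) tn / (tn + fp); }
    static double ppv(int tp, int fp) { return (double) tp / (tp + fp); }
    static double accuracy(int tp, int tn, int fp, int fn) {
        return (double) (tp + tn) / (tp + tn + fp + fn);
    }
    static double f1(int tp, int fp, int fn) {
        return 2.0 * tp / (2.0 * tp + fp + fn);
    }
    static double mcc(int tp, int tn, int fp, int fn) {
        double denom = Math.sqrt((double) (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn));
        return (tp * (double) tn - fp * (double) fn) / denom;
    }

    public static void main(String[] args) {
        // Hypothetical counts: 40 TP, 45 TN, 5 FP, 10 FN out of 100 samples.
        System.out.println(tpr(40, 10));             // 0.8
        System.out.println(ppv(40, 5));              // ≈ 0.889
        System.out.println(accuracy(40, 45, 5, 10)); // 0.85
        System.out.println(f1(40, 5, 10));           // ≈ 0.842
    }
}
```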


  1. Matthews correlation coefficient:
  2. Sensitivity and specificity:

Ghost Object Causing Crash in Java

A Bad Code Style

Java’s GC has made our lives easier, but it also makes things more implicit and sometimes easier to get wrong. Recently I encountered a bad code style that makes it easy to introduce bugs.

Imagine you have a class with a reference field and a method. The method creates a new object and assigns it to the field, so every time you call the method, a new object is created and assigned. The code is essentially like below,

class A {
    SomeClass B;

    void initB() {
        B = new SomeClass();
        // ... do something with B ...
    }
}

If we call initB many times, this pattern creates a lot of objects without references to them. Since we’re not referencing them, GC will do us the favor of collecting them. However, things can easily go wrong.

When we do something with B after creating the object, the object can become referenced somewhere else (e.g. B is a task, and “do something” adds it to a system task queue). Now if we call initB many times, we’ll end up with lots of such objects somewhere, with only the last one referred to by B. We may not have control over the non-referenced objects any more. This results in garbage.

Things can get worse. If SomeClass defines some callback method, which is triggered later by some signal, we may end up having the callback invoked on many instances of SomeClass.
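A minimal pure-Java sketch of this pattern (all names here are made up for illustration; the list stands in for a system component such as a message queue that keeps old instances alive):

```java
import java.util.ArrayList;
import java.util.List;

// Each call to initB creates a fresh callback and hands it to a "system"
// queue; only the last one remains referenced by the field b, but all of
// them stay alive in the queue and all of them would fire.
public class GhostObjects {
    static List<Runnable> systemQueue = new ArrayList<>();

    static class A {
        Runnable b;

        void initB() {
            b = () -> { /* callback triggered later by some signal */ };
            systemQueue.add(b); // "do something with b": the queue now holds it
        }
    }

    public static void main(String[] args) {
        A a = new A();
        a.initB();
        a.initB();
        a.initB();
        System.out.println(systemQueue.size());        // 3: three live instances
        System.out.println(systemQueue.get(2) == a.b); // true: only the last is ours
    }
}
```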

An Example in Android

Below is an example in Android.

package com.example.androidtest;

import android.app.Activity;
import android.os.Bundle;
import android.os.Handler;
import android.util.Log;

public class MainActivity extends Activity {

  private static final String TAG = "MainActivity";
  private Handler handler;
  private Runnable runnable;
  private Integer value;

  @Override
  protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    value = 10;
    startRunnable();
    startRunnable();
    removeRunnable();
  }

  private void startRunnable() {
    handler = new Handler();
    runnable = new Runnable() {
      @Override
      public void run() {
        Log.d(TAG, value.toString());
      }
    };
    handler.postDelayed(runnable, 2000);
  }

  private void removeRunnable() {
    handler.removeCallbacks(runnable);
    value = null;
  }
}


The startRunnable method creates new Handler and Runnable objects and assigns them to the class’s reference fields. We call this method twice, leaving one Handler and one Runnable unreferenced by us.

However, these objects are still referenced by Android system components. The postDelayed method essentially adds the runnable to a message queue associated with the current thread (the actual details are more complicated, but you get the idea).

We then call removeRunnable to remove the runnable associated with handler, and assign null to value.

However, this doesn’t remove the ghost runnable associated with the ghost handler. The run() method of the ghost runnable will still be triggered, and it will crash when it tries to call value.toString() on the now-null value.

The Fix

This coding style is bad and makes it easy to introduce bugs. We essentially want only a single instance of Handler and Runnable at any time. If we have to create a new instance, we should ensure the old one is discarded properly, and leaving it referenced by system components is not proper disposal.

If the system provides callback functions that are guaranteed to be called exactly once, we can create the objects inside those functions.

If we need to create objects inside methods that can be called multiple times, we can either reuse the old object, or properly reset the old one before creating a new one. The code below shows how to do that to fix the example above.

package com.example.androidtest;

import android.app.Activity;
import android.os.Bundle;
import android.os.Handler;
import android.util.Log;

public class MainActivityNew extends Activity {

  private static final String TAG = "MainActivityNew";
  private Handler handler;
  private Runnable runnable;
  private Integer value;

  @Override
  protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    value = 10;
    startRunnable();
    startRunnable();
    removeRunnable();
  }

  private void startRunnable() {
    if (handler == null) {
      handler = new Handler();
    } else {
      handler.removeCallbacks(runnable); // discard any old runnable first
    }
    runnable = new Runnable() {
      @Override
      public void run() {
        Log.d(TAG, value.toString());
      }
    };
    handler.postDelayed(runnable, 2000);
  }

  private void removeRunnable() {
    handler.removeCallbacks(runnable);
    value = null;
  }
}


In the startRunnable method, if a Handler instance is already available, we just reuse it; and if there’s potentially an old Runnable associated with handler, we remove it first.

Logistic Regression

These are some notes taken to summarize what I learned from Andrew Ng’s machine learning course on Coursera.


Linear regression predicts continuous values. At times, we need to categorize things instead. Logistic regression is a probabilistic statistical classification model that does that.

We will first examine how logistic regression classifies things into two categories (either 0 or 1), and then how it is used for multiple categories.

The logistic regression model can be described by the logistic/sigmoid function below,

h(x) = g(theta'x) = 1 / (1 + e^(-theta'x))

h(x) can be interpreted as the estimated probability that y = 1 on input x.

If theta’X >= 0, h(x) >= 0.5, we predict output y = 1

If theta’X < 0, h(x) < 0.5, we predict output y = 0

theta’X essentially describes the decision boundary. Note that we can use other values instead of 0.5 as the cutoff point if that is more suitable.
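The prediction rule above can be sketched as follows (a minimal illustration; the class name, theta and x values are made-up examples, with x[0] = 1 as the intercept term):

```java
// Computes h(x) = sigmoid(theta' x) and thresholds it at 0.5 to predict
// y = 1 or y = 0, as described in the text.
public class LogisticPredict {
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // h(x) = sigmoid(theta' x)
    static double h(double[] theta, double[] x) {
        double z = 0;
        for (int i = 0; i < theta.length; i++) {
            z += theta[i] * x[i];
        }
        return sigmoid(z);
    }

    static int predict(double[] theta, double[] x) {
        return h(theta, x) >= 0.5 ? 1 : 0;
    }

    public static void main(String[] args) {
        double[] theta = { -3.0, 1.0, 1.0 }; // decision boundary: x1 + x2 = 3
        System.out.println(predict(theta, new double[] { 1, 2, 2 })); // 1
        System.out.println(predict(theta, new double[] { 1, 0, 1 })); // 0
    }
}
```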

Cost Function

The cost function for logistic regression is defined as below,

J(theta) = (1/m) * sum_{i=1..m} Cost(h(x(i)), y(i))

The cost is further defined as,

Cost(h(x), y) = -log(h(x))      if y = 1
Cost(h(x), y) = -log(1 - h(x))  if y = 0

We can merge the two cases, and the cost function eventually becomes

J(theta) = -(1/m) * sum_{i=1..m} [ y(i) * log(h(x(i))) + (1 - y(i)) * log(1 - h(x(i))) ]

With regularization, the cost function becomes,

J(theta) = -(1/m) * sum_{i=1..m} [ y(i) * log(h(x(i))) + (1 - y(i)) * log(1 - h(x(i))) ] + (lambda / (2m)) * sum_{j=1..n} theta_j^2

Note that j starts from 1 as a convention.

Gradient Descent

The gradient descent update for logistic regression is identical in form to linear regression’s, except that h(x(i)) is the sigmoid function instead of a linear one:

theta_j := theta_j - alpha * (1/m) * sum_{i=1..m} (h(x(i)) - y(i)) * x_j(i)

Multi-class Classification: One-vs-All

We can use one-vs-all technique to apply logistic regression to multi-class classification. The idea is to train a logistic regression classifier for each class i to predict the probability that y = i. Then we pick the category that has the maximum probability for an input.