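Background

Android's UI toolkit is not thread-safe, so every UI update must run on the main thread; otherwise an exception is thrown. At the same time, the main thread must not perform long-running work, or the app will trigger an ANR (Application Not Responding), and network requests are not allowed on the main thread at all. Time-consuming data work is therefore usually done on a worker thread, and the resulting UI update is handed back to the main (UI) thread through a Handler. A simple example:

Create the Handler on the main thread: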

Handler handler = new Handler(new Handler.Callback() {
    @Override
    public boolean handleMessage(Message msg) {
        // Handle the received message
        return true;
    }
});

Send a message from a worker thread through the Handler that was created on the main thread:

handler.post(new Runnable() {
    @Override
    public void run() {
        //do something
    }
});

//or
handler.sendMessage(msg);

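This moves the UI update from the worker thread to the main thread. Note that to create a Handler on a worker thread, you must first call Looper.prepare() to create that thread's Looper (creating a Handler on a thread that has no Looper throws an exception) and then call Looper.loop() to start the message loop.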
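The thread switch described above can be sketched without any Android classes. The following is a minimal plain-Java model, not Android's Handler: the class ThreadSwitchDemo, its post() method, and the use of a LinkedBlockingQueue are all assumptions made for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Plain-Java model of "post work to the thread that owns the loop".
// All names are hypothetical; this is not Android's Handler.
class ThreadSwitchDemo {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    final Thread loopThread;

    ThreadSwitchDemo() {
        loopThread = new Thread(() -> {
            try {
                while (true) {
                    queue.take().run(); // blocks while the queue is empty
                }
            } catch (InterruptedException e) {
                // interrupted: leave the loop
            }
        }, "loop-thread");
        loopThread.setDaemon(true);
        loopThread.start();
    }

    // May be called from any thread; the task runs on loopThread.
    void post(Runnable task) {
        queue.add(task);
    }

    // Posts from a worker thread and returns the name of the thread
    // on which the task actually ran.
    static String demo() {
        ThreadSwitchDemo looper = new ThreadSwitchDemo();
        String[] ranOn = new String[1];
        CountDownLatch done = new CountDownLatch(1);
        Thread worker = new Thread(() -> looper.post(() -> {
            ranOn[0] = Thread.currentThread().getName();
            done.countDown();
        }), "worker");
        worker.start();
        try {
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("task did not run");
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        } finally {
            looper.loopThread.interrupt();
        }
        return ranOn[0];
    }

    public static void main(String[] args) {
        System.out.println("task ran on: " + demo());
    }
}
```

The task is created on the worker thread but executes on the loop thread, which is the same hand-off a Handler performs between a worker thread and the main thread.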

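Creating the message loop

Creating the Looper

To use the message mechanism on a thread other than the main thread, you need to create a Looper object that runs the message loop. A Looper must be created by calling the static method Looper.prepare() (Looper.prepareMainLooper() is called by the Android system to create the main thread's Looper). The method checks whether the current thread already has a Looper. If it does not, the method calls the Looper constructor and "binds" the new object to the current thread through a ThreadLocal; otherwise it throws an exception: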

public static void prepare() {
    prepare(true);
}

private static void prepare(boolean quitAllowed) {
    if (sThreadLocal.get() != null) {
        throw new RuntimeException("Only one Looper may be created per thread");
    }
    sThreadLocal.set(new Looper(quitAllowed));
}

/**
 * Initialize the current thread as a looper, marking it as an
 * application's main looper. The main looper for your application
 * is created by the Android environment, so you should never need
 * to call this function yourself.  See also: {@link #prepare()}
 */
public static void prepareMainLooper() {
    prepare(false);
    synchronized (Looper.class) {
        if (sMainLooper != null) {
            throw new IllegalStateException("The main Looper has already been prepared.");
        }
        sMainLooper = myLooper();
    }
}

public static @Nullable Looper myLooper() {
    return sThreadLocal.get();
}

Creating a Looper also creates a MessageQueue:

private Looper(boolean quitAllowed) {
    mQueue = new MessageQueue(quitAllowed);
    mThread = Thread.currentThread();
}

A thread can have at most one Looper. Android enforces this with the ThreadLocal class, which from the caller's point of view behaves like a Map keyed by the current thread.
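The one-Looper-per-thread rule can be reproduced in plain Java. The sketch below is a hypothetical MiniLooper that copies only the ThreadLocal logic from the listing above; it is not the Android class and it has no queue.

```java
// Hypothetical model of Looper.prepare()/myLooper(); not the Android class.
class MiniLooper {
    private static final ThreadLocal<MiniLooper> sThreadLocal = new ThreadLocal<>();

    private MiniLooper() {}

    static void prepare() {
        if (sThreadLocal.get() != null) {
            throw new RuntimeException("Only one Looper may be created per thread");
        }
        sThreadLocal.set(new MiniLooper());
    }

    static MiniLooper myLooper() {
        return sThreadLocal.get();
    }

    // Returns true if a second prepare() on the same thread is rejected
    // and another thread gets its own, separate instance.
    static boolean demo() {
        sThreadLocal.remove(); // start clean so the demo can be re-run
        prepare();
        MiniLooper mine = myLooper();
        boolean rejected = false;
        try {
            prepare();
        } catch (RuntimeException e) {
            rejected = true;
        }
        MiniLooper[] other = new MiniLooper[1];
        boolean[] emptyBefore = new boolean[1];
        Thread t = new Thread(() -> {
            emptyBefore[0] = (myLooper() == null); // nothing bound on a new thread
            prepare();
            other[0] = myLooper();
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return rejected && emptyBefore[0] && other[0] != null && other[0] != mine;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```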

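Creating the MessageQueue

MessageQueue stores the "messages" of the message loop and is created inside the Looper class. Because each thread can create at most one Looper, each thread can also create at most one MessageQueue: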

MessageQueue(boolean quitAllowed) {
    mQuitAllowed = quitAllowed;
    mPtr = nativeInit();
}

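This calls a native method, nativeInit(), which creates the native-layer MessageQueue and stores its address in the mPtr field. That field is the bridge between the Java-layer message mechanism and the native-layer message mechanism.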

static jlong android_os_MessageQueue_nativeInit(JNIEnv* env, jclass clazz) {
    NativeMessageQueue* nativeMessageQueue = new NativeMessageQueue();
    if (!nativeMessageQueue) {
        jniThrowRuntimeException(env, "Unable to allocate native queue");
        return 0;
    }

    nativeMessageQueue->incStrong(env);
    return reinterpret_cast<jlong>(nativeMessageQueue);
}

Creating the NativeMessageQueue

This class backs the Java-layer MessageQueue. Note that, despite its name, NativeMessageQueue does not store native-layer "messages" itself. It is created as follows:

NativeMessageQueue::NativeMessageQueue() :
        mPollEnv(NULL), mPollObj(NULL), mExceptionObj(NULL) {
    mLooper = Looper::getForThread();   // get the current thread's Looper
    if (mLooper == NULL) {  // none exists yet, so create one
        mLooper = new Looper(false);
        Looper::setForThread(mLooper);  // bind the new Looper to the current thread
    }
}

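Creating the native Looper

This object is created by NativeMessageQueue and is used to send and handle native-layer "messages". It also uses a Linux pipe to block and wake the message-handling thread: when that thread has nothing to process, it blocks on the pipe's read-end file descriptor until another thread wakes it by writing to the write end. The constructor does two things:

Creates a pipe and stores the file descriptors of its read and write ends;
Creates an epoll instance and registers the pipe's read-end file descriptor with it.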

Looper::Looper(bool allowNonCallbacks) :
        mAllowNonCallbacks(allowNonCallbacks), mSendingMessage(false),
        mResponseIndex(0), mNextMessageUptime(LLONG_MAX) {
    int wakeFds[2];
    int result = pipe(wakeFds); // create a pipe
    LOG_ALWAYS_FATAL_IF(result != 0, "Could not create wake pipe.  errno=%d", errno);
    mWakeReadPipeFd = wakeFds[0];   // read end of the pipe
    mWakeWritePipeFd = wakeFds[1];  // write end of the pipe
    result = fcntl(mWakeReadPipeFd, F_SETFL, O_NONBLOCK);
    LOG_ALWAYS_FATAL_IF(result != 0, "Could not make wake read pipe non-blocking.  errno=%d",
            errno);
    result = fcntl(mWakeWritePipeFd, F_SETFL, O_NONBLOCK);
    LOG_ALWAYS_FATAL_IF(result != 0, "Could not make wake write pipe non-blocking.  errno=%d",
            errno);
    
    // Allocate the epoll instance and register the wake pipe.
    mEpollFd = epoll_create(EPOLL_SIZE_HINT);   // create an epoll instance
    LOG_ALWAYS_FATAL_IF(mEpollFd < 0, "Could not create epoll instance.  errno=%d", errno);
    struct epoll_event eventItem;
    memset(& eventItem, 0, sizeof(epoll_event)); // zero out unused members of data field union
    eventItem.events = EPOLLIN;
    eventItem.data.fd = mWakeReadPipeFd;    // associate the event with the pipe's read end
    result = epoll_ctl(mEpollFd, EPOLL_CTL_ADD, mWakeReadPipeFd, & eventItem);
    LOG_ALWAYS_FATAL_IF(result != 0, "Could not add wake read pipe to epoll instance.  errno=%d",
            errno);
}
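The pipe-plus-epoll pattern has a close analogue in Java NIO, which makes it easy to experiment with outside the platform source. The sketch below is an assumption-laden model: WakePipe is a hypothetical class, java.nio.channels.Pipe stands in for the native pipe, and Selector stands in for the epoll instance. It also models waking the thread and draining the pipe, which the native Looper does in separate functions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Java NIO analogue of the native wake pipe. WakePipe is hypothetical;
// the Selector plays the role of the epoll instance.
class WakePipe {
    private final Pipe pipe;
    private final Selector selector;

    WakePipe() {
        try {
            pipe = Pipe.open();
            pipe.source().configureBlocking(false);
            pipe.sink().configureBlocking(false);
            selector = Selector.open();
            // Watch the read end, as epoll_ctl(EPOLL_CTL_ADD) does natively.
            pipe.source().register(selector, SelectionKey.OP_READ);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // timeoutMillis < 0 blocks until woken, 0 returns at once,
    // > 0 waits at most that long. Returns true if a wake-up was seen.
    boolean pollOnce(long timeoutMillis) {
        try {
            int ready;
            if (timeoutMillis < 0) {
                ready = selector.select();
            } else if (timeoutMillis == 0) {
                ready = selector.selectNow();
            } else {
                ready = selector.select(timeoutMillis);
            }
            selector.selectedKeys().clear();
            if (ready == 0) {
                return false;
            }
            drain();
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Any thread may call this to make a blocked pollOnce() return.
    void wake() {
        try {
            pipe.sink().write(ByteBuffer.wrap(new byte[] {'W'}));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read away stale bytes so the next pollOnce() can block again.
    private void drain() throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(16);
        while (pipe.source().read(buffer) > 0) {
            buffer.clear();
        }
    }

    public static void main(String[] args) {
        WakePipe looper = new WakePipe();
        System.out.println("before wake: " + looper.pollOnce(0));
        new Thread(looper::wake).start();
        System.out.println("after wake: " + looper.pollOnce(-1));
    }
}
```

Writing a single byte is enough: the content does not matter, only the fact that the read end becomes readable.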

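The message loop

Once the initialization above is complete, Looper.loop() can be called to start the message loop.

Looper.loop()

This method runs an infinite loop that repeatedly calls MessageQueue.next() to take a "message" and hands it to the Handler that sent it (msg.target). When no message is ready, the call to MessageQueue.next() blocks the current thread. The loop exits only when MessageQueue.next() returns null.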

/**
 * Run the message queue in this thread. Be sure to call
 * {@link #quit()} to end the loop.
 */
public static void loop() {
    final Looper me = myLooper();
    if (me == null) {
        throw new RuntimeException("No Looper; Looper.prepare() wasn't called on this thread.");
    }
    final MessageQueue queue = me.mQueue;
    ... ...

    for (;;) {
        Message msg = queue.next(); // might block
        if (msg == null) {
            // No message indicates that the message queue is quitting.
            return;
        }
        ... ...

        msg.target.dispatchMessage(msg);
        ... ...
    }
}
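The shape of this loop, block in next() and exit when it returns null, can be modeled in a few lines of plain Java. QuitLoopDemo below is hypothetical: it uses wait()/notifyAll() instead of the native poll, and its quit() lets already queued messages drain first, which is a simplifying assumption rather than a statement about Looper.quit().

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical model of loop()/next()/quit(); not Android's implementation.
class QuitLoopDemo {
    private final Queue<String> messages = new ArrayDeque<>();
    private boolean quitting;

    synchronized void enqueue(String msg) {
        messages.add(msg);
        notifyAll(); // wake a thread blocked in next()
    }

    synchronized void quit() {
        quitting = true;
        notifyAll();
    }

    // Blocks while there is nothing to do. Returns null only once the
    // queue is quitting and every pending message has been handed out.
    synchronized String next() {
        while (messages.isEmpty() && !quitting) {
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
        return messages.poll();
    }

    // Mirrors the structure of loop(): null from next() is the only exit.
    List<String> loop() {
        List<String> handled = new ArrayList<>();
        for (;;) {
            String msg = next(); // might block
            if (msg == null) {
                return handled;
            }
            handled.add(msg);
        }
    }

    static List<String> demo() {
        QuitLoopDemo queue = new QuitLoopDemo();
        Thread sender = new Thread(() -> {
            queue.enqueue("a");
            queue.enqueue("b");
            queue.quit();
        });
        sender.start();
        return queue.loop();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```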

MessageQueue.next()

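Looper.loop() calls this method to take the next "message" from the queue; when no message is due yet, the method sleeps and waits. The main flow is:

1. Take the first message in the queue.
2. If that message is a sync barrier (msg.target == null), skip over the synchronous messages that follow it and take the first asynchronous message instead.
3. If no message was found (null), set nextPollTimeoutMillis = -1 and go straight to step 5.
4. Check whether the message is due (now >= msg.when). If it is, return it; otherwise set nextPollTimeoutMillis = msg.when - now.
5. If any IdleHandlers are registered, run them (this happens at most once per call to next()) and set nextPollTimeoutMillis = 0.
6. Pass nextPollTimeoutMillis to nativePollOnce():
- -1: block the current thread until it is woken because the queue received a new message
- 0: do not block the current thread
- greater than 0: block the current thread until nextPollTimeoutMillis milliseconds have elapsed or it is woken earlier
7. Go back to step 1.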

Message next() {
    // Return here if the message loop has already quit and been disposed.
    // This can happen if the application tries to restart a looper after quit
    // which is not supported.
    final long ptr = mPtr;
    if (ptr == 0) {
        return null;    // the message loop has already quit, so return immediately
    }

    int pendingIdleHandlerCount = -1; // -1 only during first iteration
    int nextPollTimeoutMillis = 0;
    for (;;) {
        if (nextPollTimeoutMillis != 0) {
            Binder.flushPendingCommands();  // flush pending Binder commands before the thread may block
        }

        nativePollOnce(ptr, nextPollTimeoutMillis);

        synchronized (this) {
            // Try to retrieve the next message.  Return if found.
            final long now = SystemClock.uptimeMillis();
            Message prevMsg = null;
            Message msg = mMessages;
            if (msg != null && msg.target == null) {
                // Stalled by a barrier.  Find the next asynchronous message in the queue.
                do {
                    prevMsg = msg;
                    msg = msg.next;
                } while (msg != null && !msg.isAsynchronous());
            }
            if (msg != null) {
                if (now < msg.when) {
                    // Next message is not ready.  Set a timeout to wake up when it is ready.
                    nextPollTimeoutMillis = (int) Math.min(msg.when - now, Integer.MAX_VALUE);
                } else {
                    // Got a message.
                    mBlocked = false;
                    if (prevMsg != null) {
                        prevMsg.next = msg.next;
                    } else {
                        mMessages = msg.next;
                    }
                    msg.next = null;
                    if (DEBUG) Log.v(TAG, "Returning message: " + msg);
                    msg.markInUse();
                    return msg;
                }
            } else {
                // No more messages.
                nextPollTimeoutMillis = -1;
            }

            // Process the quit message now that all pending messages have been handled.
            if (mQuitting) {
                dispose();
                return null;
            }

            // If first time idle, then get the number of idlers to run.
            // Idle handles only run if the queue is empty or if the first message
            // in the queue (possibly a barrier) is due to be handled in the future.
            if (pendingIdleHandlerCount < 0
                    && (mMessages == null || now < mMessages.when)) {
                pendingIdleHandlerCount = mIdleHandlers.size();
            }
            if (pendingIdleHandlerCount <= 0) {
                // No idle handlers to run.  Loop and wait some more.
                mBlocked = true;
                continue;
            }

            if (mPendingIdleHandlers == null) {
                mPendingIdleHandlers = new IdleHandler[Math.max(pendingIdleHandlerCount, 4)];
            }
            mPendingIdleHandlers = mIdleHandlers.toArray(mPendingIdleHandlers);
        }

        // Run the idle handlers.
        // We only ever reach this code block during the first iteration.
        for (int i = 0; i < pendingIdleHandlerCount; i++) {
            final IdleHandler idler = mPendingIdleHandlers[i];
            mPendingIdleHandlers[i] = null; // release the reference to the handler

            boolean keep = false;
            try {
                keep = idler.queueIdle();
            } catch (Throwable t) {
                Log.wtf(TAG, "IdleHandler threw exception", t);
            }

            if (!keep) {
                synchronized (this) {
                    mIdleHandlers.remove(idler);
                }
            }
        }

        // Reset the idle handler count to 0 so we do not run them again.
        pendingIdleHandlerCount = 0;

        // While calling an idle handler, a new message could have been delivered
        // so go back and look again for a pending message without waiting.
        nextPollTimeoutMillis = 0;
    }
}
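The timeout that next() hands to nativePollOnce() depends only on the first dispatchable message and the current time. The helper below isolates that rule as a pure function. PollTimeout and its parameter names are hypothetical, and the 0 case is a simplification: the real method returns the message at that point instead of polling again.

```java
// Hypothetical helper that isolates the timeout rule used by next().
class PollTimeout {
    // headWhen: due time of the first dispatchable message, or null if
    // the queue has none. now: current uptime in milliseconds.
    static int nextPollTimeoutMillis(Long headWhen, long now) {
        if (headWhen == null) {
            return -1; // nothing queued: block until woken
        }
        if (now >= headWhen) {
            return 0;  // already due: no need to wait
        }
        // Not due yet: sleep until it is, clamped to the int range.
        return (int) Math.min(headWhen - now, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        System.out.println(nextPollTimeoutMillis(null, 100L));
        System.out.println(nextPollTimeoutMillis(100L, 100L));
        System.out.println(nextPollTimeoutMillis(150L, 100L));
    }
}
```

The clamp matters because when is a long while the native poll takes an int; a message scheduled far in the future would otherwise overflow.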

MessageQueue.nativePollOnce()

This is a native method that simply forwards to the pollOnce() method of NativeMessageQueue:

static void android_os_MessageQueue_nativePollOnce(JNIEnv* env, jobject obj,
        jlong ptr, jint timeoutMillis) {
    NativeMessageQueue* nativeMessageQueue = reinterpret_cast<NativeMessageQueue*>(ptr);
    nativeMessageQueue->pollOnce(env, obj, timeoutMillis);
}

NativeMessageQueue.pollOnce()

Likewise, this method simply calls the pollOnce() method of the native Looper:

void NativeMessageQueue::pollOnce(JNIEnv* env, jobject pollObj, int timeoutMillis) {
    mPollEnv = env;
    mPollObj = pollObj;
    mLooper->pollOnce(timeoutMillis);
    mPollObj = NULL;
    mPollEnv = NULL;

    if (mExceptionObj) {
        env->Throw(mExceptionObj);
        env->DeleteLocalRef(mExceptionObj);
        mExceptionObj = NULL;
    }
}

Looper.pollOnce()

This method runs an infinite loop that keeps calling pollInner() to check for new messages, until pollInner() returns a non-zero value:

int Looper::pollOnce(int timeoutMillis, int* outFd, int* outEvents, void** outData) {
    int result = 0;
    for (;;) {
        ... ...
            
        if (result != 0) {
            ... ...
                
            return result;
        }

        result = pollInner(timeoutMillis);
    }
}

Looper.pollInner()

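This method calls epoll_wait() to monitor I/O events on the file descriptors registered with the epoll instance. If no I/O event occurs, the current thread blocks for a period determined by timeoutMillis.

After epoll_wait() returns, the method iterates over the reported events to find the file descriptors on which I/O occurred. If one of them is the read end of the pipe associated with the current thread (fd == mWakeReadPipeFd) and the event type is EPOLLIN, another thread has written data into that pipe, that is, another thread has sent a message to the current thread.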

int Looper::pollInner(int timeoutMillis) {
    ... ...
        
    int result = ALOOPER_POLL_WAKE;
    ... ...
        
    struct epoll_event eventItems[EPOLL_MAX_EVENTS];
    int eventCount = epoll_wait(mEpollFd, eventItems, EPOLL_MAX_EVENTS, timeoutMillis);
    ... ...
        
    for (int i = 0; i < eventCount; i++) {
        int fd = eventItems[i].data.fd;
        uint32_t epollEvents = eventItems[i].events;
        if (fd == mWakeReadPipeFd) {
            if (epollEvents & EPOLLIN) {
                awoken();
            }
            ... ...
        }
        ... ...
    }
    ... ...
        
    return result;
}

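After being woken by another thread, the current thread must drain the stale data from the pipe so that it can go back to sleep in epoll_wait() when there is nothing to process: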

void Looper::awoken() {
    char buffer[16];
    ssize_t nRead;
    do {
        nRead = read(mWakeReadPipeFd, buffer, sizeof(buffer));
    } while ((nRead == -1 && errno == EINTR) || nRead == sizeof(buffer));
}

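Sending a message

Within the message mechanism, Handler is responsible for both sending and handling messages. Handler.sendMessage() makes it easy for another thread to send a message to the main thread and have the main thread handle it.

Handler.sendMessage()

This method eventually reaches sendMessageAtTime() and does two main things:

Stores the current Handler in the Message.target field, so that the message can call back into Handler.dispatchMessage() when it is handled;
Adds the Message to the message queue through MessageQueue.enqueueMessage().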

public boolean sendMessageAtTime(Message msg, long uptimeMillis) {
    MessageQueue queue = mQueue;
    if (queue == null) {
        RuntimeException e = new RuntimeException(
                this + " sendMessageAtTime() called with no mQueue");
        Log.w("Looper", e.getMessage(), e);
        return false;
    }
    return enqueueMessage(queue, msg, uptimeMillis);
}


private boolean enqueueMessage(MessageQueue queue, Message msg, long uptimeMillis) {
    msg.target = this;
    if (mAsynchronous) {
        msg.setAsynchronous(true);
    }
    return queue.enqueueMessage(msg, uptimeMillis);
}

MessageQueue.enqueueMessage()

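As the code shows, although it is called a message queue, the messages are actually stored in a linked list, and a new message is inserted at the position determined by its when field. Two cases are worth noting:

The new message is inserted at the head of the list
The new message is inserted in the middle of the list

In the first case the head has changed, so if the thread that owns the queue (the main thread in this example) is blocked (mBlocked == true), nativeWake() must be called to wake it so that it can process the new message.

In the second case, nativeWake() is called only when the head of the list is a sync barrier and the new message is the earliest asynchronous message in the queue (and, as before, the thread is blocked); otherwise the thread is not woken.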

boolean enqueueMessage(Message msg, long when) {
    if (msg.target == null) {
        throw new IllegalArgumentException("Message must have a target.");
    }
    if (msg.isInUse()) {
        throw new IllegalStateException(msg + " This message is already in use.");
    }

    synchronized (this) {
        if (mQuitting) {
            IllegalStateException e = new IllegalStateException(
                    msg.target + " sending message to a Handler on a dead thread");
            Log.w(TAG, e.getMessage(), e);
            msg.recycle();
            return false;
        }

        msg.markInUse();
        msg.when = when;
        Message p = mMessages;
        boolean needWake;
        if (p == null || when == 0 || when < p.when) {
            // New head, wake up the event queue if blocked.
            msg.next = p;
            mMessages = msg;
            needWake = mBlocked;
        } else {
            // Inserted within the middle of the queue.  Usually we don't have to wake
            // up the event queue unless there is a barrier at the head of the queue
            // and the message is the earliest asynchronous message in the queue.
            needWake = mBlocked && p.target == null && msg.isAsynchronous();
            Message prev;
            for (;;) {
                prev = p;
                p = p.next;
                if (p == null || when < p.when) {
                    break;
                }
                if (needWake && p.isAsynchronous()) {
                    needWake = false;
                }
            }
            msg.next = p; // invariant: p == prev.next
            prev.next = msg;
        }

        // We can assume mPtr != 0 because mQuitting is false.
        if (needWake) {
            nativeWake(mPtr);
        }
    }
    return true;
}
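The insertion and wake-up rules can be checked in isolation with a small model. SortedQueueDemo below is hypothetical: it keeps only the linked-list logic from the listing above and replaces msg.target == null with an explicit barrier flag.

```java
// Hypothetical model of the sorted insert and the needWake decision.
class SortedQueueDemo {
    static final class Msg {
        final String name;
        final long when;
        final boolean async;
        final boolean barrier; // stands in for "target == null"
        Msg next;

        Msg(String name, long when, boolean async, boolean barrier) {
            this.name = name;
            this.when = when;
            this.async = async;
            this.barrier = barrier;
        }
    }

    Msg head;
    boolean blocked; // true while the loop thread sleeps in the poll

    // Inserts msg in 'when' order and returns whether the loop thread
    // would need to be woken.
    boolean enqueue(Msg msg) {
        Msg p = head;
        boolean needWake;
        if (p == null || msg.when == 0 || msg.when < p.when) {
            // New head: wake the thread if it is blocked.
            msg.next = p;
            head = msg;
            needWake = blocked;
        } else {
            // Middle or tail: wake only for the earliest async message
            // behind a barrier.
            needWake = blocked && p.barrier && msg.async;
            Msg prev;
            for (;;) {
                prev = p;
                p = p.next;
                if (p == null || msg.when < p.when) {
                    break;
                }
                if (needWake && p.async) {
                    needWake = false; // an earlier async message exists
                }
            }
            msg.next = p;
            prev.next = msg;
        }
        return needWake;
    }

    String order() {
        StringBuilder sb = new StringBuilder();
        for (Msg m = head; m != null; m = m.next) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(m.name);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        SortedQueueDemo q = new SortedQueueDemo();
        q.blocked = true;
        q.enqueue(new Msg("b", 200, false, false));
        q.enqueue(new Msg("c", 300, false, false));
        q.enqueue(new Msg("a", 100, false, false));
        System.out.println(q.order());
    }
}
```

Messages with equal when values keep their insertion order, because the scan stops only at a strictly later message.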

MessageQueue.nativeWake()

This simply calls NativeMessageQueue.wake():

static void android_os_MessageQueue_nativeWake(JNIEnv* env, jclass clazz, jlong ptr) {
    NativeMessageQueue* nativeMessageQueue = reinterpret_cast<NativeMessageQueue*>(ptr);
    nativeMessageQueue->wake();
}

NativeMessageQueue.wake()

This, too, simply calls Looper.wake():

void NativeMessageQueue::wake() {
    mLooper->wake();
}

Looper.wake()

This method writes one character into the pipe created earlier; the resulting I/O event on the pipe wakes the target thread:

void Looper::wake() {
    ssize_t nWrite;
    do {
        nWrite = write(mWakeWritePipeFd, "W", 1);
    } while (nWrite == -1 && errno == EINTR);
    if (nWrite != 1) {
        if (errno != EAGAIN) {
            ALOGW("Could not write wake signal, errno=%d", errno);
        }
    }
}

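Handling a message

Handler.dispatchMessage()

In Looper.loop(), each message taken from the queue is passed to Handler.dispatchMessage(msg) for dispatch. Dispatch proceeds as follows:

If the Message carries its own callback (typically a message sent through Handler.postXxx()), the callback is run directly;
Otherwise, if the Handler was given a Callback, the message is passed to Callback.handleMessage();
If no Callback was set, or Callback.handleMessage() does not consume the message (returns false), the message is passed to Handler.handleMessage(msg).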

public void dispatchMessage(Message msg) {
    if (msg.callback != null) {
        handleCallback(msg);
    } else {
        if (mCallback != null) {
            if (mCallback.handleMessage(msg)) {
                return;
            }
        }
        handleMessage(msg);
    }
}

private static void handleCallback(Message message) {
    message.callback.run();
}
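The three-level priority is easy to confirm with a stand-alone model. DispatchDemo below is hypothetical: it mirrors the branch structure of dispatchMessage() and uses a java.util.function.Predicate in place of Handler.Callback.

```java
import java.util.function.Predicate;

// Hypothetical model of the dispatch order; not Android's Handler.
class DispatchDemo {
    static final class Msg {
        Runnable callback; // set by post()-style sends in the real API
    }

    final StringBuilder log = new StringBuilder();
    Predicate<Msg> callback; // stands in for Handler.Callback

    void handleMessage(Msg msg) {
        log.append("handleMessage;");
    }

    void dispatchMessage(Msg msg) {
        if (msg.callback != null) {
            msg.callback.run();
            return;
        }
        if (callback != null && callback.test(msg)) {
            return; // the Callback consumed the message
        }
        handleMessage(msg);
    }

    static String demo() {
        DispatchDemo h = new DispatchDemo();
        Msg posted = new Msg();
        posted.callback = () -> h.log.append("runnable;");
        h.callback = m -> { h.log.append("callback;"); return true; };
        h.dispatchMessage(posted);    // the message's own callback wins
        h.dispatchMessage(new Msg()); // the Callback consumes it
        h.callback = m -> { h.log.append("callback;"); return false; };
        h.dispatchMessage(new Msg()); // the Callback declines, falls through
        h.callback = null;
        h.dispatchMessage(new Msg()); // only handleMessage() runs
        return h.log.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```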

Miscellaneous

The epoll mechanism

https://blog.csdn.net/u010657219/article/details/44061629

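Sync barriers

Normally Message.target must not be null, but there is one exception: a message known as a SyncBarrier, whose target == null. When the message loop finds a SyncBarrier at the head of the queue, it stops dispatching the synchronous messages behind it and dispatches only asynchronous messages (see MessageQueue.next()); synchronous messages are dispatched again once the barrier is removed. A SyncBarrier is added to the queue with MessageQueue.postSyncBarrier() and removed with MessageQueue.removeSyncBarrier().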
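The effect of a barrier on retrieval order can be seen in a small model. BarrierDemo below is hypothetical: it ignores the when field entirely, marks the barrier with a flag instead of target == null, and its removeBarrier() only handles a barrier at the head.

```java
// Hypothetical model of how next() treats a sync barrier.
class BarrierDemo {
    static final class Msg {
        final String name;
        final boolean async;
        final boolean barrier;
        Msg next;

        Msg(String name, boolean async, boolean barrier) {
            this.name = name;
            this.async = async;
            this.barrier = barrier;
        }
    }

    Msg head;

    void append(Msg m) {
        if (head == null) {
            head = m;
            return;
        }
        Msg p = head;
        while (p.next != null) {
            p = p.next;
        }
        p.next = m;
    }

    // Returns the next dispatchable message, or null if there is none.
    Msg next() {
        Msg prev = null;
        Msg msg = head;
        if (msg != null && msg.barrier) {
            // Stalled by a barrier: look for the next asynchronous message.
            do {
                prev = msg;
                msg = msg.next;
            } while (msg != null && !msg.async);
        }
        if (msg == null) {
            return null;
        }
        if (prev != null) {
            prev.next = msg.next;
        } else {
            head = msg.next;
        }
        msg.next = null;
        return msg;
    }

    void removeBarrier() {
        if (head != null && head.barrier) {
            head = head.next;
        }
    }

    static String name(Msg m) {
        return m == null ? "null" : m.name;
    }

    static String demo() {
        BarrierDemo q = new BarrierDemo();
        q.append(new Msg("barrier", false, true));
        q.append(new Msg("s1", false, false));
        q.append(new Msg("a1", true, false));
        q.append(new Msg("s2", false, false));
        StringBuilder out = new StringBuilder();
        out.append(name(q.next())).append(','); // async message passes the barrier
        out.append(name(q.next())).append(','); // only sync messages left: stalled
        q.removeBarrier();
        out.append(name(q.next())).append(','); // sync messages flow again
        out.append(name(q.next()));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The synchronous messages s1 and s2 are not lost while the barrier is in place; they stay in the list and are returned in their original order after it is removed.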

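Synchronous and asynchronous messages

By default, the messages a Handler sends are synchronous. When a Handler is created through a constructor that takes an async parameter and that parameter is set to true, the messages it sends are asynchronous. The two kinds of message differ only while a sync barrier is in the queue; otherwise they are handled identically.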

IdleHandler

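When the message queue has no message that needs to be handled right away, it runs the registered IdleHandlers. This makes IdleHandler a convenient place for low-priority work, and used sensibly it can improve performance, for example by deferring non-urgent work until the thread is idle.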
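The keep-or-remove contract of an idle handler, where the return value of queueIdle() decides whether the handler stays registered, can be modeled as follows. IdleDemo is hypothetical and uses java.util.function.BooleanSupplier in place of the IdleHandler interface.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical model of idle-handler bookkeeping.
class IdleDemo {
    private final List<BooleanSupplier> idleHandlers = new ArrayList<>();

    void addIdleHandler(BooleanSupplier handler) {
        idleHandlers.add(handler);
    }

    int size() {
        return idleHandlers.size();
    }

    // Called when no message is due. Each handler runs once; a handler
    // that returns false (or throws) is removed afterwards.
    void runIdleHandlers() {
        for (BooleanSupplier handler : new ArrayList<>(idleHandlers)) {
            boolean keep = false;
            try {
                keep = handler.getAsBoolean();
            } catch (RuntimeException e) {
                // a failing handler is treated as "do not keep"
            }
            if (!keep) {
                idleHandlers.remove(handler);
            }
        }
    }

    // Returns {runs of the one-shot handler, runs of the kept handler,
    // handlers still registered} after two idle periods.
    static int[] demo() {
        IdleDemo queue = new IdleDemo();
        int[] runs = new int[2];
        queue.addIdleHandler(() -> { runs[0]++; return false; }); // one-shot
        queue.addIdleHandler(() -> { runs[1]++; return true; });  // stays registered
        queue.runIdleHandlers();
        queue.runIdleHandlers();
        return new int[] {runs[0], runs[1], queue.size()};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(demo()));
    }
}
```

Returning false suits one-off deferred work, while returning true suits work that should repeat every time the thread goes idle.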

Handler memory leaks

For convenience, a Handler is often created like this, overriding dispatchMessage() to handle messages:

Handler handler = new Handler() {
    @Override
    public void dispatchMessage(Message msg) {
        super.dispatchMessage(msg);
        //todo handle message
    }
};

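However, this approach carries a risk of memory leaks, especially inside an Activity. In Java, a non-static inner class (including an anonymous class) holds a reference to its enclosing instance, and the Handler is in turn referenced by every message it sends. Once such a message is in the message queue, the enclosing object cannot be garbage-collected until all messages sent through that Handler have been handled and recycled, or the whole message loop has finished and been reclaimed.

The fix is to declare the Handler as a static nested class and have it refer to the enclosing Activity through a weak reference.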

public static class SafeHandler extends Handler {
    private final WeakReference<Activity> mActivity;

    public SafeHandler(Activity activity) {
        mActivity = new WeakReference<>(activity);
    }

    @Override
    public void handleMessage(Message msg) {
        Activity activity = mActivity.get();
        super.handleMessage(msg);
        if (activity != null) {
            //todo handle message
        }
    }
}

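The following way of posting a message is just as common and leaks for the same reason, since the anonymous Runnable also holds the enclosing instance; the fix is the same as above: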

handler.postDelayed(new Runnable() {
    @Override
    public void run() {
        //do something
    }
}, 5 * 60 * 1000);
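Applied to a posted Runnable, the same fix might look like the sketch below. It is written in plain Java so that it runs anywhere: Screen is a hypothetical stand-in for the Activity, and RefreshTask is an assumed name for the static nested Runnable.

```java
import java.lang.ref.WeakReference;

// Hypothetical leak-safe task; Screen stands in for an Activity.
class WeakTaskDemo {
    static final class Screen {
        int refreshCount;

        void refresh() {
            refreshCount++;
        }
    }

    // A static nested class has no hidden reference to an enclosing
    // instance, and the weak reference does not keep the Screen alive.
    static final class RefreshTask implements Runnable {
        private final WeakReference<Screen> screenRef;

        RefreshTask(Screen screen) {
            screenRef = new WeakReference<>(screen);
        }

        @Override
        public void run() {
            Screen screen = screenRef.get();
            if (screen != null) {
                screen.refresh(); // the Screen is still alive
            }
            // otherwise the Screen is gone and the task does nothing
        }
    }

    public static void main(String[] args) {
        Screen screen = new Screen();
        new RefreshTask(screen).run();
        System.out.println(screen.refreshCount);
    }
}
```

In an Activity, an instance of such a class would be passed to handler.postDelayed() in place of the anonymous Runnable. Calling Handler.removeCallbacksAndMessages(null) when the Activity is destroyed additionally removes anything still pending.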

Reference book: 《Android系统源代码情景分析》