Preface

This article was written as an assignment for my Operating System Principles course. The topic was left open; roughly, we were asked to describe the development of an operating system. I had long been interested in Plan 9, the system that aimed to succeed Unix, and this was a chance to spend time understanding and actually using it, so I chose the topic without hesitation.

What follows is my own understanding and summary of the design of Plan 9.

Overview

Plan 9 is a distributed operating system developed at Bell Labs' Computing Science Research Center starting in the mid-1980s, intended as the successor to Unix. It was built by the same group that created Unix and the C language; contributors included Ken Thompson, Dennis Ritchie, and Rob Pike.

This article introduces some of Plan 9's design ideas.

Design

Files

Plan 9 inherits the Unix philosophy that "everything is a file" and extends it further: all computing resources are managed as files, with read and write operations serving as the uniform way to interact with them.

For example, in Unix much hardware control requires the ioctl system call, and systems such as X Window also manipulate resources through function calls rather than representing them as files. In Plan 9, by contrast, whether it is a CPU, a peripheral device, the network, or the graphical interface itself, the designers consistently represent each resource as a file and interact with it through the file interface.

In this sense, a file is no longer the traditional unit of disk storage, but a name referring to a computing resource in the general sense.

The File System

Like Unix, Plan 9 uses a hierarchical file system. Since files are really resources, a file path carries a meaning similar to a URI: a computing resource can be reached through a specific path.

In how files are organized, Plan 9 follows naming conventions similar to Unix; for example, running ls /proc lists all processes.

Plan 9's file system is not a traditional file system confined to disks. Because resources are files, whether a resource comes from a local device or must be accessed over the network is merely an implementation detail, so both share a uniform representation in the file system. Plan 9 encapsulates resource access in a protocol called 9P. With this abstraction, the interfaces for accessing remote and local resources become identical, and the distributed file system lays the foundation for distributed computing at the level of system design; even CPU resources, for example, can be conveniently shared as files.
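The key point is that callers use one interface regardless of where a resource lives. A minimal sketch (in Python purely for illustration; LocalFile and RemoteResource are made-up names, and this is not 9P itself) shows the shape of the idea:

```python
# Illustrative sketch: local and remote resources expose the same
# read() interface, so callers cannot tell them apart.

class LocalFile:
    def __init__(self, data: bytes):
        self._data = data

    def read(self) -> bytes:
        return self._data

class RemoteResource:
    """Stands in for a resource served over the network by another machine."""
    def __init__(self, fetch):
        self._fetch = fetch  # a callable performing the (remote) access

    def read(self) -> bytes:
        return self._fetch()

def cat(resource) -> bytes:
    # The caller uses a single interface for every resource.
    return resource.read()

local = LocalFile(b"hello")
remote = RemoteResource(lambda: b"hello")
assert cat(local) == cat(remote)
```

In Plan 9 the role of `read()` here is played by the 9P protocol messages, which the kernel speaks to local drivers and remote servers alike.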

Namespaces

A file system is just a way of organizing resources, and letting different programs organize resources differently brings great extensibility. Plan 9 therefore gives each process its own view of the file system, called a namespace, thereby isolating resources between processes.

Moreover, a parent process can serve virtual files into a child's namespace via the 9P protocol (files as services). This provides a cheap and native form of cross-process resource access, and it means the resources handed to a child can be freely replaced. Implementing a VPN then only requires proxying the /net directory, implementing a window system only requires proxying the bitblt file, and so on.

Union Directories

A directory in a child's file system often needs to be replaced without re-providing all of its contents, and union directories are a natural solution to this.

A union directory in Plan 9 works like a combination of Unix mounting and search paths. After directory A is unioned onto directory B, a lookup in B first searches A and falls back to B itself only when the name is not found there. Combined with the network, this yields usage patterns such as unioning a remote /bin onto the local one so that remote programs can be run directly.
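The lookup rule above can be modeled in a few lines (a Python sketch of the semantics only; real union directories operate on mounted file trees, not dictionaries):

```python
# Model a union directory as an ordered list of layers; lookup returns
# the first layer containing the name, mirroring Plan 9's rule of
# searching the unioned directory before the original one.

def union_lookup(name, layers):
    for layer in layers:          # earlier layers shadow later ones
        if name in layer:
            return layer[name]
    raise FileNotFoundError(name)

remote_bin = {"ls": "remote ls binary"}
local_bin = {"ls": "local ls binary", "cat": "local cat binary"}

# After unioning the remote /bin onto the local one:
union = [remote_bin, local_bin]
assert union_lookup("ls", union) == "remote ls binary"   # shadowed
assert union_lookup("cat", union) == "local cat binary"  # falls back
```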

UTF-8

A more widely known component of Plan 9 is the UTF-8 encoding, devised by Ken Thompson. Under this encoding, Unicode text is automatically backward-compatible with plain ASCII, so no conversion is ever needed, and it has since become ubiquitous. UTF-8 has long been Plan 9's native encoding.
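The backward compatibility is easy to demonstrate (a short Python sketch): every ASCII character encodes to the identical single byte under UTF-8, while other characters become multi-byte sequences whose bytes can never be mistaken for ASCII.

```python
# ASCII text is byte-for-byte identical under UTF-8.
ascii_text = "Plan 9 from Bell Labs"
assert ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# Non-ASCII characters become multi-byte sequences in which every
# byte has the high bit set, so no byte collides with ASCII.
snowman = "\u2603".encode("utf-8")  # U+2603 SNOWMAN
assert snowman == b"\xe2\x98\x83"
assert all(b >= 0x80 for b in snowman)
```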

History

Although Plan 9's design is more universal and extensible than Unix's, it never won many users in practice.

Plan 9 was made available to universities in 1992 and released commercially by AT&T in 1995. By 1996, however, AT&T had shifted its focus to Inferno, a system based on Plan 9's ideas and intended to compete with Java. In 2000, Lucent, then in charge of Bell Labs, ended commercial support for Plan 9, and a few years later released its code to the public under a free software license.

Assessment

Plan 9 refined and extended many ideas from Unix and achieved an elegant, unified treatment of resource management and distributed computing, earning it a lasting place in the history of operating system architecture. Components that originated in it, such as UTF-8 and the rc shell, have been adopted by other operating systems, and its successors Inferno and Plan B continue to develop.

Yet elegant design without a corresponding ecosystem meant that Plan 9 did not succeed in practice, which is a pity. It is exactly as Eric Raymond observed, in words that should keep ambitious software engineers ever vigilant:

Plan 9 failed simply because it fell short of being a compelling enough improvement on Unix to displace its ancestor. Compared to Plan 9, Unix creaks and clanks and has obvious rust spots, but it gets the job done well enough to hold its position.

There is a lesson here for ambitious system architects: the most dangerous enemy of a better solution is an existing codebase that is just good enough.

Acknowledgements

In writing this article I consulted the pages Plan 9 from Bell Labs – Wikipedia, the free encyclopedia and 贝尔实验室九号计划 – 维基百科,自由的百科全书, among others; my thanks to their authors.

Introduction

For content-oriented Android applications, how and where to store the content to display is an issue every developer must deal with. The Android framework provides a comprehensive solution with ContentProvider (and much more) that fits well with a SQLite database; but in many other scenarios the app only needs to cache some information, while the majority of content is retrieved directly from the network, eliminating the need for a database (and a complicated content scheme).

But when we look at the core of this issue, we soon find that it is really a question of whether to have a central storage, and of how to notify different components about a change.

Initial Scenario

Let’s look at the naive solution first, retrieving content from network and only keeping them in memory. In this case, no central storage is needed, and every component fetches its content independently, simple enough. But this approach immediately becomes flawed when it comes to content updating such as user modification. For instance, we have a list activity and a detail activity, and the user can often modified something in the detail activity, say, they clicked ‘like’, and the like button shall look activated not only in the detail activity, but also in the list activity now in background.

Since content is fetched independently, no component knows whether another component holds the same content, and the copy another component shows becomes stale the moment this component retrieves an updated version. Ideally, any piece of content is a single resource that should have exactly one state at any point in time, and representing it by one entity in memory is the best way to achieve this: a central storage (whether in memory or also backed by disk) and the DRY (don’t repeat yourself) principle, which is essentially the story of the framework’s solution.

Either way, we always need a notification mechanism rather than simply sharing one holder of a unique entity among clients, because clients may need to transform the content into something else instead of using the shared entity directly, and they need to notify others when the content changes.

ContentProvider Mechanism

The idea behind ContentProvider is that the provider is the single central storage, and its content is the only real entity, always up to date, while the content its clients hold is merely a copy that easily goes stale. When a change happens, the central storage is refreshed, ContentObservers are notified of the change, and they then query the central ContentProvider for an updated copy of the content.

In this scheme, only the ContentProvider (or a delegated SyncAdapter) fetches data from the network; a client can only request a fetch and wait for the change callback like any other client. The central ContentProvider acts as a middleman between clients and the network, ensuring the uniqueness of every content entity.
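A minimal sketch of this centralized pattern (plain Python, not the actual Android API; CentralStore and the URI below are made up for illustration): observers never receive data directly, they are only told that a URI changed and must re-query the store, just as ContentObserver's onChanged() carries no payload.

```python
# Central store: the only authoritative copy of each resource.
class CentralStore:
    def __init__(self):
        self._data = {}
        self._observers = {}      # uri -> list of callbacks

    def register(self, uri, on_changed):
        self._observers.setdefault(uri, []).append(on_changed)

    def query(self, uri):
        return self._data.get(uri)

    def update(self, uri, value):
        self._data[uri] = value
        # Observers get no payload, only a change signal; they
        # must query the store again for the fresh copy.
        for on_changed in self._observers.get(uri, []):
            on_changed(uri)

store = CentralStore()
seen = []
store.register("content://posts/1", lambda uri: seen.append(store.query(uri)))
store.update("content://posts/1", {"liked": True})
assert seen == [{"liked": True}]
```

Note the cost this design implies: every resource must have a URI, and every interested component must route all reads and writes through the store.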

Go centralized?

But a principled and comprehensive solution may not be the best solution for a specific type of scenario. We would face at least the following difficulties:

  1. Identifier: To ensure only one central entity per resource, we need URIs that uniquely identify resources, just as the name suggests, so we must define a content URI scheme. But things get complicated quickly given the intricate logical hierarchy an interlinked content system requires.

  2. Action: We lose flexibility in handling changes for special cases. Think of infinite loading, liking, and post deletion: we would need a generic mechanism to notify all observers of a URI when items are newly loaded, modified, or removed, while the framework’s ContentObserver offers only an onChanged() method. Handling these cases individually, by contrast, would be much easier to implement.

  3. Releasing memory: Because most of the content is dynamic, there is little point in persisting it on disk. So we store retrieved items only in memory, and we must release them once nobody needs them. This gets tricky when, for instance, one observer watches a collection and another observer on a detail of that collection is removed; whether the detail’s content can be released now depends on whether the collection observer actually observes that detail, which the central content manager cannot infer from the URI scheme.

  4. JSON interoperability: The framework’s ContentProvider mechanism uses Cursor, whose columns are limited to basic types and blobs, making it incompatible with the widely used JSON approach. We would have to roll our own.

Implementing and using such a framework is a complex and heavy task, and complexity is error-prone, while it brings little advantage over the decentralized solution we are about to discuss.

Stay decentralized

So we want to stay on the track of having no central storage, allowing duplicated (and so possibly inconsistent) entities of the same content in memory.

This means we must sync the components whenever content is freshly fetched from the network. The solution here is much more specific to each scenario: we can use an event bus system that is already present to broadcast updates, listen for the specific event codes of content changes, and respond accordingly.
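A sketch of this decentralized approach (plain Python; the EventBus class and the event name are made-up illustrations, not a specific library): each component keeps its own copy of the content and patches it when a relevant content-change event arrives, so no central store exists at all.

```python
# A tiny synchronous event bus; a real app would reuse an existing one.
class EventBus:
    def __init__(self):
        self._listeners = {}

    def subscribe(self, event, listener):
        self._listeners.setdefault(event, []).append(listener)

    def post(self, event, payload):
        for listener in self._listeners.get(event, []):
            listener(payload)

bus = EventBus()

# Two components each hold their own duplicated copy of the same post.
list_copy = {"id": 1, "liked": False}
detail_copy = {"id": 1, "liked": False}

def syncer(copy):
    # Patch the local copy when the event concerns the same post.
    def on_like_changed(payload):
        if copy["id"] == payload["id"]:
            copy["liked"] = payload["liked"]
    return on_like_changed

bus.subscribe("post_like_changed", syncer(list_copy))
bus.subscribe("post_like_changed", syncer(detail_copy))

# The detail screen toggles 'like' and posts the change event.
bus.post("post_like_changed", {"id": 1, "liked": True})
assert list_copy["liked"] and detail_copy["liked"]
```

Each event type and its payload are defined case by case, which is exactly the scenario-specific flexibility the centralized URI scheme gives up.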

At first sight this solution may seem insufficiently generic, lacking the beauty of unity. However, considering the overall cost of a centralized storage with a URI scheme, I believe this decentralize-and-sync mechanism is the way to go for applications in this scenario.

Conclusion

Different mechanisms suit different scenarios. Android’s centralized ContentProvider mechanism fits content that should be persisted and synced, while the decentralized mechanism works well for applications with highly dynamic content.