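A Deep Dive into Android's Low-Level VPN Code
While writing minivtun_android, I wanted minivtun to act as a global VPN (routes set to 0.0.0.0/0 and ::/0) while keeping the traffic that minivtun itself generates out of the tunnel. The original plan was to use VpnService.protect, but it turned out not to work. I ended up solving the problem by adding the app's own package name through VpnService.Builder.addDisallowedApplication (a simpler approach; how did I not find it when I first searched!).
To answer the question of why protect had no effect, I decided to read the low-level code of Android's VPN service, and at the same time learn how it sets up routing rules (how a given app is made to use, or not use, the VPN).
On the Linux command line I used to set minivtun's fwmark parameter and then use ip rule to send packets with that fwmark to a different route table. Does Android do anything different?
What follows is the code trace, which turned out to be more complicated than I expected. Judging by the final implementation, though, Android does not differ from the command line: in the end it still relies on fwmark and ip rule.
Implementation of disallowedApplications
To make a given app bypass the VPN, the app is first mapped to its UID; the netd process is then notified (via a remote call) and performs the actual work.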
frameworks/base/services/core/java/com/android/server/connectivity/Vpn.java
/**
* Triggers an update of the VPN network's excluded UIDs if a VPN is running.
*/
public synchronized void refreshPlatformVpnAppExclusionList() {
updateAppExclusionList(getAppExclusionList(mPackage));
}
private synchronized void updateAppExclusionList(@NonNull List<String> excludedApps) {
// Re-build and update NetworkCapabilities via NetworkAgent.
if (mNetworkAgent != null) {
// Only update the platform VPN
if (isIkev2VpnRunner()) {
mConfig.disallowedApplications = List.copyOf(excludedApps);
mNetworkCapabilities = new NetworkCapabilities.Builder(mNetworkCapabilities)
.setUids(createUserAndRestrictedProfilesRanges(
mUserId, null /* allowedApplications */, excludedApps))
.build();
setVpnNetworkPreference(getSessionKeyLocked(),
createUserAndRestrictedProfilesRanges(mUserId,
mConfig.allowedApplications, mConfig.disallowedApplications));
doSendNetworkCapabilities(mNetworkAgent, mNetworkCapabilities);
}
}
}
private void setVpnNetworkPreference(String session, Set<Range<Integer>> ranges) {
BinderUtils.withCleanCallingIdentity(
() -> mConnectivityManager.setVpnDefaultForUids(session, ranges));
}
/**
* Creates a {@link Set} of non-intersecting {@code Range<Integer>} objects including all UIDs
* associated with one user, and any restricted profiles attached to that user.
*
* <p>If one of {@param allowedApplications} or {@param disallowedApplications} is provided,
* the UID ranges will match the app list specified there. Otherwise, all UIDs
* in each user and profile will be included.
*
* @param userId The userId to create UID ranges for along with any of its restricted
* profiles.
* @param allowedApplications (optional) List of applications to allow.
* @param disallowedApplications (optional) List of applications to deny.
*/
@VisibleForTesting
Set<Range<Integer>> createUserAndRestrictedProfilesRanges(@UserIdInt int userId,
@Nullable List<String> allowedApplications,
@Nullable List<String> disallowedApplications) {
packages/modules/Connectivity/framework/src/android/net/ConnectivityManager.java
/**
* Inform the system that this VPN session should manage the passed UIDs.
*
* A VPN with the specified session ID may call this method to inform the system that the UIDs
* in the specified range are subject to a VPN.
* When this is called, the system will only choose a VPN for the default network of the UIDs in
* the specified ranges.
*
* This method declares that the UIDs in the range will only have a VPN for their default
* network, but does not block the UIDs from accessing other networks (permissions allowing) by
* explicitly requesting it with the {@link Network} API.
* Compare {@link #setRequireVpnForUids(boolean, Collection)}, which does not affect what
* network the UIDs get as default, but will block them from accessing non-VPN networks.
*
* @param session The VPN session which manages the passed UIDs.
* @param ranges The uid ranges which will treat VPN as their only default network.
*
* @hide
*/
@RequiresPermission(anyOf = {
NetworkStack.PERMISSION_MAINLINE_NETWORK_STACK,
android.Manifest.permission.NETWORK_STACK,
android.Manifest.permission.NETWORK_SETTINGS})
@SystemApi(client = MODULE_LIBRARIES)
public void setVpnDefaultForUids(@NonNull String session,
@NonNull Collection<Range<Integer>> ranges) {
Objects.requireNonNull(ranges);
final UidRange[] rangesArray = getUidRangeArray(ranges);
try {
mService.setVpnNetworkPreference(session, rangesArray);
} catch (RemoteException e) {
throw e.rethrowFromSystemServer();
}
}
packages/modules/Connectivity/service/src/com/android/server/ConnectivityService.java
/**
* Sets the specified UIDs to get/receive the VPN as the only default network.
*
* Calling this will overwrite the existing network preference for this session, and the
* specified UIDs won't get any default network when no VPN is connected.
*
* @param session The VPN session which manages the passed UIDs.
* @param ranges The uid ranges which will treat VPN as the only preferred network. Clear the
* setting for this session if the array is empty. Null is not allowed, the
* method will use {@link Objects#requireNonNull(Object)} to check this variable.
* @hide
*/
@Override
public void setVpnNetworkPreference(String session, UidRange[] ranges) {
Objects.requireNonNull(ranges);
enforceNetworkStackOrSettingsPermission();
final UidRange[] sortedRanges = UidRangeUtils.sortRangesByStartUid(ranges);
if (UidRangeUtils.sortedRangesContainOverlap(sortedRanges)) {
throw new IllegalArgumentException(
"setVpnNetworkPreference: Passed UID ranges overlap");
}
mHandler.sendMessage(mHandler.obtainMessage(EVENT_SET_VPN_NETWORK_PREFERENCE,
new VpnNetworkPreferenceInfo(session,
new ArraySet<UidRange>(Arrays.asList(ranges)))));
}
public void handleMessage(Message msg) {
switch (msg.what) {
case EVENT_SET_VPN_NETWORK_PREFERENCE:
handleSetVpnNetworkPreference((VpnNetworkPreferenceInfo) msg.obj);
break;
private void handleSetVpnNetworkPreference(VpnNetworkPreferenceInfo preferenceInfo) {
Log.d(TAG, "handleSetVpnNetworkPreference: preferenceInfo = " + preferenceInfo);
mVpnNetworkPreferences = mVpnNetworkPreferences.minus(preferenceInfo.getKey());
mVpnNetworkPreferences = mVpnNetworkPreferences.plus(preferenceInfo);
removeDefaultNetworkRequestsForPreference(PREFERENCE_ORDER_VPN);
addPerAppDefaultNetworkRequests(createNrisForVpnNetworkPreference(mVpnNetworkPreferences));
// Finally, rematch.
rematchAllNetworksAndRequests();
}
private void updateVpnUidRanges(boolean add, NetworkAgentInfo nai, Set<UidRange> uidRanges) {
int[] exemptUids = new int[2];
// TODO: Excluding VPN_UID is necessary in order to not to kill the TCP connection used
// by PPTP. Fix this by making Vpn set the owner UID to VPN_UID instead of system when
// starting a legacy VPN, and remove VPN_UID here. (b/176542831)
exemptUids[0] = VPN_UID;
exemptUids[1] = nai.networkCapabilities.getOwnerUid();
UidRangeParcel[] ranges = toUidRangeStableParcels(uidRanges);
maybeCloseSockets(nai, ranges, exemptUids);
try {
if (add) {
mNetd.networkAddUidRangesParcel(new NativeUidRangeConfig(
nai.network.netId, ranges, PREFERENCE_ORDER_VPN));
} else {
mNetd.networkRemoveUidRangesParcel(new NativeUidRangeConfig(
nai.network.netId, ranges, PREFERENCE_ORDER_VPN));
}
} catch (Exception e) {
loge("Exception while " + (add ? "adding" : "removing") + " uid ranges " + uidRanges +
" on netId " + nai.network.netId + ". " + e);
}
maybeCloseSockets(nai, ranges, exemptUids);
}
private void maybeCloseSockets(NetworkAgentInfo nai, UidRangeParcel[] ranges,
int[] exemptUids) {
if (nai.isVPN() && !nai.networkAgentConfig.allowBypass) {
try {
mNetd.socketDestroy(ranges, exemptUids);
} catch (Exception e) {
loge("Exception in socket destroy: ", e);
}
}
}
protected ConnectivityService(Context context, IDnsResolver dnsresolver,
IpConnectivityLog logger, INetd netd, Dependencies deps) {
mNetd = netd;
public ConnectivityService(Context context) {
this(context, getDnsResolver(context), new IpConnectivityLog(),
INetd.Stub.asInterface((IBinder) context.getSystemService(Context.NETD_SERVICE)),
new Dependencies());
}
/**
* Adds or removes one rule for each supplied UID range to prohibit all network activity outside
* of secure VPN.
*
* When a UID is covered by one of these rules, traffic sent through any socket that is not
* protected or explicitly overriden by the system will be rejected. The kernel will respond
* with an ICMP prohibit message.
*
* Initially, there are no such rules. Any rules that are added will only last until the next
* restart of netd or the device.
*
* @param add {@code true} if the specified UID ranges should be denied access to any network
* which is not secure VPN by adding rules, {@code false} to remove existing rules.
* @param uidRanges a set of non-overlapping, contiguous ranges of UIDs to which to apply or
* remove this restriction.
* <p> Added rules should not overlap with existing rules. Likewise, removed rules should
* each correspond to an existing rule.
*
* @throws ServiceSpecificException in case of failure, with an error code corresponding to the
* unix errno.
*/
@Override public void networkRejectNonSecureVpn(boolean add, android.net.UidRangeParcel[] uidRanges) throws android.os.RemoteException
{
}
/** Administratively closes sockets belonging to the specified UIDs. */
@Override public void socketDestroy(android.net.UidRangeParcel[] uidRanges, int[] exemptUids) throws android.os.RemoteException
{
system/netd/server/NetdNativeService.cpp
binder::Status NetdNativeService::socketDestroy(const std::vector<UidRangeParcel>& uids,
const std::vector<int32_t>& skipUids) {
ENFORCE_NETWORK_STACK_PERMISSIONS();
SockDiag sd;
if (!sd.open()) {
return binder::Status::fromServiceSpecificError(EIO,
String8("Could not open SOCK_DIAG socket"));
}
UidRanges uidRanges(uids);
int err = sd.destroySockets(uidRanges, std::set<uid_t>(skipUids.begin(), skipUids.end()),
true /* excludeLoopback */);
if (err) {
return binder::Status::fromServiceSpecificError(-err,
String8::format("destroySockets: %s", strerror(-err)));
}
return binder::Status::ok();
binder::Status NetdNativeService::networkAddUidRanges(
int32_t netId, const std::vector<UidRangeParcel>& uidRangeArray) {
// NetworkController::addUsersToNetwork is thread-safe.
ENFORCE_NETWORK_STACK_PERMISSIONS();
int ret = gCtls->netCtrl.addUsersToNetwork(netId, UidRanges(uidRangeArray),
UidRanges::SUB_PRIORITY_HIGHEST);
return statusFromErrcode(ret);
}
}
system/netd/server/NetworkController.cpp
int NetworkController::addUsersToNetwork(unsigned netId, const UidRanges& uidRanges,
int32_t subPriority) {
ScopedWLock lock(mRWLock);
Network* network = getNetworkLocked(netId);
if (int ret = isWrongNetworkForUidRanges(netId, network)) {
return ret;
}
return network->addUsers(uidRanges, subPriority);
}
class SockDiag {
// Destroys all "live" (CONNECTED, SYN_SENT, SYN_RECV) TCP sockets for the given UID ranges.
int destroySockets(const UidRanges& uidRanges, const std::set<uid_t>& skipUids,
bool excludeLoopback);
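The UID-range step in this section (what createUserAndRestrictedProfilesRanges hands to netd when apps are excluded) can be sketched roughly as follows. This is a minimal sketch, not the framework code: rangesExcluding and kPerUserRange are hypothetical names, and the only Android-specific fact assumed is that each user owns a block of 100000 UIDs.

```cpp
#include <set>
#include <utility>
#include <vector>

// Each Android user owns a block of 100000 UIDs (UserHandle.PER_USER_RANGE).
constexpr int kPerUserRange = 100000;

using UidRange = std::pair<int, int>;  // inclusive [start, stop]

// Hypothetical helper: returns the UID ranges of `userId` that remain after
// punching out every UID listed in `excludedUids`.
std::vector<UidRange> rangesExcluding(int userId, const std::set<int>& excludedUids) {
    const int first = userId * kPerUserRange;
    const int last = first + kPerUserRange - 1;
    std::vector<UidRange> ranges;
    int start = first;
    for (int uid : excludedUids) {                // std::set iterates in ascending order
        if (uid < first || uid > last) continue;  // UID belongs to another user
        if (uid > start) ranges.emplace_back(start, uid - 1);
        start = uid + 1;                          // resume right after the excluded UID
    }
    if (start <= last) ranges.emplace_back(start, last);
    return ranges;
}
```

Excluding a single app UID such as 10123 for user 0 yields two ranges, 0-10122 and 10124-99999; only sockets whose owner falls inside these ranges get the VPN as their default network.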
Implementation of protect
To make a given socket bypass the VPN, the fwmark server has to be notified.
VpnService
frameworks/base/core/java/android/net/VpnService.java
/**
* Protect a socket from VPN connections. After protecting, data sent
* through this socket will go directly to the underlying network,
* so its traffic will not be forwarded through the VPN.
* This method is useful if some connections need to be kept
* outside of VPN. For example, a VPN tunnel should protect itself if its
* destination is covered by VPN routes. Otherwise its outgoing packets
* will be sent back to the VPN interface and cause an infinite loop. This
* method will fail if the application is not prepared or is revoked.
*
* <p class="note">The socket is NOT closed by this method.
*
* @return {@code true} on success.
*/
public boolean protect(int socket) {
return NetworkUtilsInternal.protectFromVpn(socket);
}
frameworks/base/core/java/com/android/internal/net/NetworkUtilsInternal.java
/**
* Protect {@code socketfd} from VPN connections. After protecting, data sent through
* this socket will go directly to the underlying network, so its traffic will not be
* forwarded through the VPN.
*/
public static native boolean protectFromVpn(int socketfd);
frameworks/base/core/jni/com_android_internal_net_NetworkUtilsInternal.cpp
static jboolean android_net_utils_protectFromVpn(JNIEnv *env, jclass clazz, jint socket) {
return (jboolean)!protectFromVpn(socket);
}
static jboolean android_net_utils_protectFromVpnWithFd(JNIEnv *env, jclass clazz, jobject javaFd) {
return android_net_utils_protectFromVpn(env, clazz, AFileDescriptor_getFd(env, javaFd));
}
static const JNINativeMethod gNetworkUtilMethods[] = {
{"setAllowNetworkingForProcess", "(Z)V",
(void *)android_net_utils_setAllowNetworkingForProcess},
{"protectFromVpn", "(I)Z", (void *)android_net_utils_protectFromVpn},
{"protectFromVpn", "(Ljava/io/FileDescriptor;)Z",
(void *)android_net_utils_protectFromVpnWithFd},
};
system/netd/client/NetdClient.cpp
extern "C" int protectFromVpn(int socketFd) {
CHECK_SOCKET_IS_MARKABLE(socketFd);
FwmarkCommand command = {FwmarkCommand::PROTECT_FROM_VPN, 0, 0, 0};
return FwmarkClient().send(&command, socketFd, nullptr);
}
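libc
Android's libc implementation is bionic. Here the basic socket operations (accept, connect, send and so on) are intercepted: instead of invoking the syscall directly, they are pointed at the implementations in libnetd_client.so, which makes more sophisticated behavior possible.
By default, libc's accept path uses __accept4 (presumably the raw syscall version); routing the call through a dispatch structure is what allows it to be replaced.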
bionic/libc/bionic/NetdClientDispatch.cpp
extern "C" __socketcall int __accept4(int, sockaddr*, socklen_t*, int);
extern "C" __socketcall int __connect(int, const sockaddr*, socklen_t);
...
extern "C" __socketcall int __socket(int, int, int);
__LIBC_HIDDEN__ NetdClientDispatch __netdClientDispatch __attribute__((aligned(32))) = {
__accept4,
__connect,
...
__socket,
...
int accept4(int fd, sockaddr* addr, socklen_t* addr_length, int flags) {
return FDTRACK_CREATE(__netdClientDispatch.accept4(fd, addr, addr_length, flags));
}
int connect(int fd, const sockaddr* addr, socklen_t addr_length) {
return __netdClientDispatch.connect(fd, addr, addr_length);
}
...
int socket(int domain, int type, int protocol) {
return FDTRACK_CREATE(__netdClientDispatch.socket(domain, type, protocol));
}
Initialization happens at startup, when the dispatch entries are pointed at the implementations in libnetd_client.so.
bionic/libc/bionic/NetdClient.cpp
static void netdClientInitImpl() {
// Prevent netd from looping back fwmarkd connections to itself. It would work, but it's
// a deadlock hazard and unnecessary overhead for the resolver.
if (getuid() == 0 && strcmp(basename(getprogname()), "netd") == 0) {
async_safe_format_log(ANDROID_LOG_INFO, "netdClient",
"Skipping libnetd_client init since *we* are netd");
return;
}
void* handle = dlopen("libnetd_client.so", RTLD_NOW);
if (handle == nullptr) {
// If the library is not available, it's not an error. We'll just use
// default implementations of functions that it would've overridden.
return;
}
netdClientInitFunction(handle, "netdClientInitAccept4", &__netdClientDispatch.accept4);
netdClientInitFunction(handle, "netdClientInitConnect", &__netdClientDispatch.connect);
...
static pthread_once_t netdClientInitOnce = PTHREAD_ONCE_INIT;
extern "C" __LIBC_HIDDEN__ void netdClientInit() {
if (pthread_once(&netdClientInitOnce, netdClientInitImpl)) {
async_safe_format_log(ANDROID_LOG_ERROR, "netdClient", "Failed to initialize libnetd_client");
}
}
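The dispatch-table replacement described in this section can be illustrated with a small stand-alone sketch. Everything here is hypothetical (rawConnect, hookedConnect, gDispatch and the one-argument signature are made up for illustration); only the swap itself mirrors what the HOOK_ON_FUNC macro in NetdClient.cpp does.

```cpp
// Stand-in for a libc entry point; the real table holds accept4, connect, socket, ...
using ConnectFn = int (*)(int fd);

// "Raw" implementation, playing the role of __connect (the syscall wrapper).
int rawConnect(int fd) { return fd; }

// Dispatch table, initially pointing at the raw implementation.
struct Dispatch {
    ConnectFn connect;
};
Dispatch gDispatch = {rawConnect};

// Saved original, playing the role of libcConnect in NetdClient.cpp.
ConnectFn gLibcConnect = nullptr;
int gHookCalls = 0;

// Wrapper that does extra work first (netd would talk to the fwmark server
// here), then delegates to the saved original.
int hookedConnect(int fd) {
    ++gHookCalls;
    return gLibcConnect(fd);
}

// Same swap as HOOK_ON_FUNC: remember the old pointer, install the wrapper.
void installHook(ConnectFn* slot) {
    if (slot && *slot) {
        gLibcConnect = *slot;
        *slot = hookedConnect;
    }
}

// Public entry point: always goes through the table, like bionic's connect().
int myConnect(int fd) { return gDispatch.connect(fd); }
```

Before installHook(&gDispatch.connect) is called, myConnect goes straight to rawConnect; afterwards every call passes through hookedConnect first. The design keeps libc free of any hard dependency on libnetd_client.so: if the library is missing, the table simply keeps its defaults.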
NetdClient
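This replaces the standard library's accept implementation: it connects to the fwmark server, which sets the fwmark on the socket. The fwmark carries the NetId and other information. It is declared as a union, which makes it easy to read the raw value and to overwrite individual fields.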
union Fwmark {
uint32_t intValue;
struct {
unsigned netId : 16;
bool explicitlySelected : 1;
bool protectedFromVpn : 1;
Permission permission : 2;
bool uidBillingDone : 1;
};
...
};
static const unsigned FWMARK_NET_ID_MASK = 0xffff;
static_assert(sizeof(Fwmark) == sizeof(uint32_t), "The entire fwmark must fit into 32 bits");
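To make the layout concrete, here is a minimal sketch that packs the same fields with explicit shifts instead of bit-fields. It assumes the bit-fields above are allocated from the least significant bit in declaration order (as GCC and Clang do on little-endian targets), so netId occupies bits 0-15, explicitlySelected bit 16 and protectedFromVpn bit 17. The helper names are hypothetical.

```cpp
#include <cstdint>

constexpr uint32_t kNetIdMask = 0xffff;             // same value as FWMARK_NET_ID_MASK
constexpr uint32_t kExplicitlySelected = 1u << 16;  // assumed position of explicitlySelected
constexpr uint32_t kProtectedFromVpn = 1u << 17;    // assumed position of protectedFromVpn

// Builds the 32-bit mark from the individual fields.
uint32_t packFwmark(uint32_t netId, bool explicitlySelected, bool protectedFromVpn) {
    uint32_t mark = netId & kNetIdMask;
    if (explicitlySelected) mark |= kExplicitlySelected;
    if (protectedFromVpn) mark |= kProtectedFromVpn;
    return mark;
}

// Field accessors, equivalent to reading the union members.
uint32_t netIdOf(uint32_t mark) { return mark & kNetIdMask; }
bool isProtectedFromVpn(uint32_t mark) { return (mark & kProtectedFromVpn) != 0; }
```

Under these assumptions, a protected socket on netId 100 carries the mark 0x20064, and a routing rule can test the protected bit alone with the mask 0x20000.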
system/netd/client/NetdClient.cpp
// accept() just calls accept4(..., 0), so there's no need to handle accept() separately.
extern "C" void netdClientInitAccept4(Accept4FunctionType* function) {
HOOK_ON_FUNC(function, libcAccept4, netdClientAccept4);
}
...
#define HOOK_ON_FUNC(remoteFunc, nativeFunc, localFunc) \
do { \
if ((remoteFunc) && *(remoteFunc)) { \
(nativeFunc) = *(remoteFunc); \
*(remoteFunc) = (localFunc); \
} \
} while (false)
...
// These variables are only modified at startup (when libc.so is loaded) and never afterwards, so
// it's okay that they are read later at runtime without a lock.
Accept4FunctionType libcAccept4 = nullptr;
...
int netdClientAccept4(int sockfd, sockaddr* addr, socklen_t* addrlen, int flags) {
int acceptedSocket = libcAccept4(sockfd, addr, addrlen, flags);
if (acceptedSocket == -1) {
return -1;
}
int family;
if (addr) {
family = addr->sa_family;
} else {
socklen_t familyLen = sizeof(family);
if (getsockopt(acceptedSocket, SOL_SOCKET, SO_DOMAIN, &family, &familyLen) == -1) {
return closeFdAndSetErrno(acceptedSocket, -errno);
}
}
if (FwmarkClient::shouldSetFwmark(family)) {
FwmarkCommand command = {FwmarkCommand::ON_ACCEPT, 0, 0, 0};
if (int error = FwmarkClient().send(&command, acceptedSocket, nullptr)) {
return closeFdAndSetErrno(acceptedSocket, error);
}
}
return acceptedSocket;
}
int netdClientConnect(int sockfd, const sockaddr* addr, socklen_t addrlen) {
const bool shouldSetFwmark = shouldMarkSocket(sockfd, addr);
if (shouldSetFwmark) {
FwmarkCommand command = {FwmarkCommand::ON_CONNECT, 0, 0, 0};
int error;
if (redirectSocketCallsIsTrue()) {
FwmarkConnectInfo connectInfo(0, 0, addr);
error = FwmarkClient().send(&command, sockfd, &connectInfo);
} else {
error = FwmarkClient().send(&command, sockfd, nullptr);
}
...
Netd Server
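Provides the implementation of the INetd interface (NetdNativeService), which can be called remotely to perform many network-related operations. It is also responsible for starting the various low-level services (such as the fwmark server, mdns and dnsproxy) that the framework calls into.
Network management above the Linux kernel is provided by the netd process. User-space network management daemons and tools already differ from one Linux distribution to another and are usually assembled from several separate programs. netd is Android's implementation, with everything contained in a single process.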
Fwmark Server
Selects the netId automatically and tags the socket with SO_MARK so that routing can identify it.
system/netd/server/FwmarkServer.cpp
int FwmarkServer::processClient(SocketClient* client, int* socketFd) {
...
switch (command.cmdId) {
case FwmarkCommand::ON_CONNECT: {
// Called before a socket connect() happens. Set an appropriate NetId into the fwmark so
// that the socket routes consistently over that network. Do this even if the socket
// already has a NetId, so that calling connect() multiple times still works.
...
if (!fwmark.explicitlySelected) {
if (!fwmark.protectedFromVpn) {
fwmark.netId = mNetworkController->getNetworkForConnect(client->getUid());
} else if (!mNetworkController->isVirtualNetwork(fwmark.netId)) {
fwmark.netId = mNetworkController->getDefaultNetwork();
}
}
break;
...
if (setsockopt(*socketFd, SOL_SOCKET, SO_MARK, &fwmark.intValue,
sizeof(fwmark.intValue)) == -1) {
return -errno;
}
return 0;
...
NetworkController
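In the end everything is translated into operations on the Linux network stack, that is, ip rule and ip route operations: each netId corresponds to its own route table, and the fwmark is what the rules match on.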
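How a rule picks a table from the fwmark and the UID can be modeled in a few lines. This is a deliberately simplified sketch: the Rule struct, the two rules and the table numbers 1100 and 1050 are all made up for illustration, and the real rule set installed by netd is considerably longer. The mask 0x20000 assumes the protectedFromVpn bit sits at bit 17 of the fwmark.

```cpp
#include <cstdint>
#include <vector>

// One policy-routing rule: match on (fwmark & mask) == value and on a UID
// range, then use the given route table.
struct Rule {
    uint32_t value;
    uint32_t mask;
    uint32_t uidStart;
    uint32_t uidEnd;
    int table;
};

// Returns the table of the first matching rule, or -1 if none matches.
// Rules are assumed to be sorted by priority, as in `ip rule` output.
int selectTable(const std::vector<Rule>& rules, uint32_t fwmark, uint32_t uid) {
    for (const Rule& r : rules) {
        const bool markMatches = (fwmark & r.mask) == r.value;
        const bool uidMatches = uid >= r.uidStart && uid <= r.uidEnd;
        if (markMatches && uidMatches) return r.table;
    }
    return -1;
}

// Hypothetical rule set with a VPN table (1100) and a default table (1050).
std::vector<Rule> exampleRules() {
    return {
        // Sockets of app UIDs 10000-99999 whose protected bit is clear use the VPN table.
        {0x0, 0x20000, 10000, 99999, 1100},
        // Everything else falls through to the default network's table.
        {0x0, 0x0, 0, UINT32_MAX, 1050},
    };
}
```

With these rules, an unprotected socket of UID 10123 is routed through table 1100, while the same UID with the protected bit set, or a UID outside the range, falls through to table 1050.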
system/netd/server/NetworkController.h
/*
* Keeps track of default, per-pid, and per-uid-range network selection, as
* well as the mark associated with each network. Networks are identified
* by netid. In all set* commands netid == 0 means "unspecified" and is
* equivalent to clearing the mapping.
*/
class NetworkController {
...
unsigned getDefaultNetwork() const;
[[nodiscard]] int setDefaultNetwork(unsigned netId);
unsigned getNetworkForUser(uid_t uid) const;
unsigned getNetworkForConnect(uid_t uid) const;
...
// |nexthop| can be NULL (to indicate a directly-connected route), "unreachable" (to indicate a
// route that's blocked), "throw" (to indicate the lack of a match), or a regular IP address.
//
// Routes are added to tables determined by the interface, so only |interface| is actually used.
// |netId| is given only to sanity check that the interface has the correct netId.
[[nodiscard]] int addRoute(unsigned netId, const char* interface, const char* destination,
const char* nexthop, bool legacy, uid_t uid, int mtu);
...
system/netd/server/NetworkController.cpp
int NetworkController::addRoute(unsigned netId, const char* interface, const char* destination,
const char* nexthop, bool legacy, uid_t uid, int mtu) {
return modifyRoute(netId, interface, destination, nexthop, ROUTE_ADD, legacy, uid, mtu);
}
int NetworkController::modifyRoute(unsigned netId, const char* interface, const char* destination,
const char* nexthop, enum RouteOperation op, bool legacy,
uid_t uid, int mtu) {
...
unsigned existingNetId = getNetworkForInterfaceLocked(interface);
...
switch (op) {
case ROUTE_ADD:
return RouteController::addRoute(interface, destination, nexthop, tableType, mtu,
0 /* priority */);
system/netd/server/RouteController.cpp
int RouteController::addRoute(const char* interface, const char* destination, const char* nexthop,
TableType tableType, int mtu, int priority) {
if (int ret = modifyRoute(RTM_NEWROUTE, NETLINK_ROUTE_CREATE_FLAGS, interface, destination,
nexthop, tableType, mtu, priority, false /* isLocal */)) {
return ret;
}
// Adds or removes an IPv4 or IPv6 route to the specified table.
// Returns 0 on success or negative errno on failure.
int RouteController::modifyRoute(uint16_t action, uint16_t flags, const char* interface,
const char* destination, const char* nexthop, TableType tableType,
int mtu, int priority, bool isLocal) {
uint32_t table;
switch (tableType) {
case RouteController::INTERFACE: {
table = getRouteTableForInterface(interface, isLocal);
int ret = modifyIpRoute(action, flags, table, interface, destination, nexthop, mtu, priority);
int modifyIpRoute(uint16_t action, uint16_t flags, uint32_t table, const char* interface,
const char* destination, const char* nexthop, uint32_t mtu, uint32_t priority) {
int ret = sendNetlinkRequest(action, flags, iov, ARRAY_SIZE(iov), nullptr);
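In the end, I still could not figure out why protect did not take effect. It made no sense.
2024.10.10 Update: minivtun-android can now use protect.
2024.12.08 Update: I finally found the reason protect did not work! I was calling connect() on the UDP socket before calling VpnService's protect(). Once connect() has been called, the Linux kernel has already completed its route lookup; setting the fwmark through protect() afterwards is too late, because the kernel will not look up the route again!
The following explanation was generated by Claude-3.5-Sonnet.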
Understanding the Mechanism
To understand this issue, we need to look at three key components:
UDP Connect Operation
- When you call connect() on a UDP socket, the kernel performs a route lookup
- This routing decision gets cached for subsequent send() operations
- The route is determined based on current socket attributes, including fwmark
VpnService.protect()
- Under the hood, protect() sets a special fwmark on the socket
- This fwmark tells the kernel to bypass VPN routing
- The fwmark is used during route lookup decisions
Kernel Route Caching
- Once a route is cached by connect(), it stays until the socket is disconnected
- Changing fwmark after connect() won't trigger a new route lookup
- The cached route continues to be used regardless of later fwmark changes
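The claim that connect() on a UDP socket already involves a route lookup can be observed without any VPN. The sketch below is only an indirect illustration: it does not touch fwmark or protect(), it merely shows that the kernel picks a local source address at connect() time, which it can only do by resolving a route. The helper names are hypothetical, and it assumes a Linux host with the loopback interface up.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string>
#include <sys/socket.h>
#include <unistd.h>

// Returns the local IPv4 address currently assigned to `fd`, or "" on error.
std::string localAddress(int fd) {
    sockaddr_in local{};
    socklen_t len = sizeof(local);
    if (getsockname(fd, reinterpret_cast<sockaddr*>(&local), &len) == -1) return "";
    char text[INET_ADDRSTRLEN] = {};
    if (!inet_ntop(AF_INET, &local.sin_addr, text, sizeof(text))) return "";
    return text;
}

// Creates a UDP socket, optionally connect()s it to 127.0.0.1:9, and reports
// the local address the kernel has chosen so far.
std::string localAddressOfUdpSocket(bool doConnect) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd == -1) return "";
    if (doConnect) {
        sockaddr_in peer{};
        peer.sin_family = AF_INET;
        peer.sin_port = htons(9);  // no packet is sent; UDP connect() is purely local
        inet_pton(AF_INET, "127.0.0.1", &peer.sin_addr);
        if (connect(fd, reinterpret_cast<sockaddr*>(&peer), sizeof(peer)) == -1) {
            close(fd);
            return "";
        }
    }
    std::string result = localAddress(fd);
    close(fd);
    return result;
}
```

An unconnected socket reports 0.0.0.0, while a connected one already reports the source address selected for the destination. For a VPN app the practical consequence, as described in the update above, is to call protect() on the socket before connect().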