RFC: add libgit2client and a command-line interface #5507
5677c44 to fb41349
You finally got around to implementing your dreams
Indeed, this is a lot of changes. I think that it might make sense to get some high-level discussion out of the way before actually reviewing the code. Here's what I'd propose as a reviewing strategy:
I think that, for reviewing the code itself, I should cleave off pieces in reviewable chunks. I'll keep this PR open with the mega branch (mostly for visibility, so that anybody can play with the actual bits) but I don't think that we should review this PR as a whole for merging. I'll cleave off the first piece soon-ish; there's still some refactoring to do to get Windows working.
0ea2e10 to f10691b
libszen commented May 26, 2020
A reference client library and CLI would indeed be invaluable! Thank you very much for championing this PR. A request: could the CLI be released under a more permissive license (public domain, Apache 2.0, ...) while retaining the current license for the client library? That would allow interested parties to copy and paste CLI code without licensing issues. It would also be consistent with the existing examples, which are released into the public domain.
Huh. I hadn't considered this. It may be that the libgit2 license is not necessarily appropriate for a CLI, but IANAL and I'd want to talk to one before I made any changes here. But I'd like to better understand your motivations around copy/pasting code. I'd like to have the business logic of dealing with client things in the libgit2client library. The CLI, IMO, would just be a way to parse the command line, and invoke pieces of libgit2 or libgit2client itself. How could we improve these two pieces to allow you to call into them instead of copy/pasting code out of the CLI project?
30cf040 to de774ce
9815952 to c7c1ab6
I think having a higher-level API would be worthwhile, and it is one thing some people have been asking for in the past. I'm not sure I'm a huge fan of having it as a separate library, though; personally I'd vote for having a build option for this instead. The most important benefit would be that we can avoid having a separate "util" library and thus don't require things like the
One question I have about the middleware is what its interface should look like. I see that right now, it simply accepts an argv array and does the parsing internally. I wonder whether we want to split concerns here and have each command accept an options structure, where the struct's members strictly resemble the API options. The parsing logic would then reside in
Yeah, this is something we've repeatedly discussed in the past and something I'm happy to have. I'd prefer to avoid the underscore and call it
As said above, I'm not a huge fan of having a shared internal library, mostly due to my desire to keep a strict boundary between libgit2 the library and libgit2 the command-line interface; sharing some internal utilities blurs the lines. Most processing that's done in the CLI should be trivial command-line parsing and execution of commands, as the heavy lifting is performed by the middleware anyway. And as I'm of the opinion that the middleware should live inside of the libgit2 library, there wouldn't be any need for the util library.
I had not considered this route... it feels weird to me that you could have two very different versions of the library: one that has a bunch of more "client"-facing functions, and one that does not. Would these two different varieties of the library have different sonames? If not, that doesn't feel right, since you'd expect ABI compatibility. Or would we have stubs that just throw? That's sort of meh as well.
I agree. The CLI should not be like libgit2_clar, where it actually has some insights into private functions. However, we would never want to reimplement
But, putting that aside for a moment, it sounds like you're describing something a bit different than the direction that I've pursued: a high level API call that matches a git porcelain command. e.g.,
But is this really useful? I concede, though, that this PR doesn't give people a lot to help in building a client off of. And, of course, I could be wrong. And the data structure that we parse command-line options into could just be the input to an API. And if it's useful for people, great, and if it's not, that's fine, too. There's no obligation for them to use it, but it does give people who want a simple mechanism to "run the CLI" that option. It might be useful to know what the folks at TortoiseGit (@csware) and GitKraken (@implausible) would want in a client API.
The `git_buf` type is now no longer a publicly available structure, and the `git_buf` family of functions are no longer exported. The deprecation layer adds a typedef for `git_buf` (as `git_userbuf`) and macros that define `git_buf` functions as `git_userbuf` functions. This provides API (but not ABI) compatibility with libgit2 1.0's buffer functionality. Within libgit2 itself, we take care to avoid including those deprecated typedefs and macros, since we want to continue using the `git_buf` type and functions unmodified. Therefore, a `GIT_DEPRECATE_BUF` guard now wraps the buffer deprecation layer. libgit2 will define that.
`git_strarray` is a public-facing type. Change `git_buf_text_common_prefix` to not use it, and just take an array of strings instead.
Our options parsing system can also be used as the basis for displaying command-line usage. Add usage information, using knowledge of the console (if we're attached to one) for wrapping nicely.
Set up a framework for subcommands, and introduce the first, "help". Help will display the commands available, and information about the help command itself. Commands are expected to provide their own usage and help information, which the help command will proxy to when necessary.
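To illustrate the shape such a subcommand framework could take, here is a minimal sketch. Every name in it (`cli_command`, `cli_command_lookup`, `cli_dispatch`, the `usage` field) is hypothetical and not taken from the PR; it only assumes that each command carries its own usage text and that `help` looks that text up rather than duplicating it.

```c
#include <stdio.h>
#include <string.h>

/* All names below are hypothetical; they are not taken from the PR. */
typedef struct {
    const char *name;                   /* subcommand name, e.g. "help" */
    int (*run)(int argc, char **argv);  /* entry point; argv[0] is the name */
    const char *usage;                  /* usage text owned by the command */
} cli_command;

static int cmd_help(int argc, char **argv);

/* NULL-terminated table of known subcommands. */
static const cli_command cli_commands[] = {
    { "help", cmd_help, "help [<command>]" },
    { NULL, NULL, NULL }
};

/* Returns the matching table entry, or NULL if the name is unknown. */
const cli_command *cli_command_lookup(const char *name)
{
    const cli_command *cmd;

    for (cmd = cli_commands; cmd->name; cmd++) {
        if (strcmp(cmd->name, name) == 0)
            return cmd;
    }

    return NULL;
}

/* "help" lists the commands, or proxies to one command's own usage text. */
static int cmd_help(int argc, char **argv)
{
    const cli_command *cmd;

    if (argc > 1) {
        if ((cmd = cli_command_lookup(argv[1])) == NULL) {
            fprintf(stderr, "unknown command: %s\n", argv[1]);
            return 1;
        }

        printf("usage: %s\n", cmd->usage);
        return 0;
    }

    printf("available commands:\n");
    for (cmd = cli_commands; cmd->name; cmd++)
        printf("  %s\n", cmd->name);

    return 0;
}

/* Dispatches on argv[0]; with no arguments, falls back to "help". */
int cli_dispatch(int argc, char **argv)
{
    const cli_command *cmd;

    if (argc < 1)
        return cmd_help(0, NULL);

    if ((cmd = cli_command_lookup(argv[0])) == NULL) {
        fprintf(stderr, "unknown command: %s\n", argv[0]);
        return 1;
    }

    return cmd->run(argc, argv);
}
```

In this sketch a new subcommand only needs a row in the table; `help` never has to know about it in advance.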
Provide a helper function to copy a number of strings from the source to the target.
SSH paths come in a variety of formats, either URLs (ssh://user@host/path) or SCP style (user@host:path). Provide a mechanism to parse them.
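As a rough sketch of what parsing the two forms might involve, the following splits an input into user, host and path. The `ssh_target` type and its functions are hypothetical names, not the PR's API, and the sketch deliberately ignores ports and IPv6 literals. It also assumes, as git does, that a `/` before the first `:` means the input is a local path rather than an SCP-style address.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type; not taken from the PR. */
typedef struct {
    char *user;  /* NULL when the input has no "user@" part */
    char *host;
    char *path;
} ssh_target;

/* Copies len bytes into a new NUL-terminated string. */
static char *dup_range(const char *start, size_t len)
{
    char *out = malloc(len + 1);

    if (out) {
        memcpy(out, start, len);
        out[len] = '\0';
    }

    return out;
}

void ssh_target_dispose(ssh_target *t)
{
    free(t->user);
    free(t->host);
    free(t->path);
    memset(t, 0, sizeof(*t));
}

/* Returns 0 on success, -1 if the input is neither form (or on OOM). */
int ssh_target_parse(ssh_target *out, const char *input)
{
    const char *p = input, *sep, *at;
    int is_url = 0;

    memset(out, 0, sizeof(*out));

    if (strncmp(p, "ssh://", 6) == 0) {
        is_url = 1;
        p += 6;
    }

    /* The host ends at the first '/' (URL form) or ':' (SCP form). */
    if ((sep = strchr(p, is_url ? '/' : ':')) == NULL || sep == p)
        return -1;

    /* Assumption: "dir/name:thing" is a local path, not an SCP address. */
    if (!is_url && memchr(p, '/', (size_t)(sep - p)) != NULL)
        return -1;

    /* Optional "user@" before the host. */
    if ((at = memchr(p, '@', (size_t)(sep - p))) != NULL) {
        if (at == p)
            return -1;

        out->user = dup_range(p, (size_t)(at - p));
        p = at + 1;
    }

    if (p == sep || (at != NULL && out->user == NULL)) {
        ssh_target_dispose(out);
        return -1;
    }

    out->host = dup_range(p, (size_t)(sep - p));

    /* URL paths keep their leading '/'; SCP paths start after the ':'. */
    if (!is_url)
        sep++;
    out->path = dup_range(sep, strlen(sep));

    if (out->host == NULL || out->path == NULL) {
        ssh_target_dispose(out);
        return -1;
    }

    return 0;
}
```

Under these assumptions, `ssh://user@host/path` yields the path `/path`, while `user@host:path` yields the relative path `path`.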
Provide a class that will display progress information to the console. Initially, it contains callbacks for fetch progress and checkout progress.
As we consume parts of the libgit2 utility functions (like `git_buf`), we will inevitably need to allocate. Since we re-use the libgit2 allocation functions - but linked into our application - we'll need to configure our allocation strategy ahead of time.
Add a new source directory, `util`, that contains utility functions like buffers, vectors, etc, that are general purpose and not necessarily part of libgit2 itself. These utility functions can be used by additional projects.
Introduce libgit2client, a client "middleware" library. This is an experimental set of utility functions and classes for client software that builds on top of libgit2. This library might contain - for example - code that invokes filters or other tools. This is incredibly useful to share and reuse among consumers but should be excluded from libgit2 itself. Users may, understandably, not want code that executes arbitrary other commands in libgit2 itself.
Introduce a command-line interface for libgit2. The goal is to be
git-compatible, so that:
1. By creating a git client ourselves, we can understand the needs of
git clients and produce a common "middleware" for commonly-used
pieces of client functionality. For example: interacting with
other command-line tools, like filter drivers or merge drivers.
This can assist other git clients.
2. We can benefit from git's unit tests, running their test suite
against our own CLI to ensure correct behavior.
3. We can easily benchmark ourselves against git to understand where we
are poorly performing, by running identical commands between git and
ourselves.
4. We can easily A/B test ourselves against git, at least for read-only
operations, which will ensure that we are producing identical output.
This commit introduces a simple infrastructure for the CLI.
The functions exported by libgit2 should stay in the libgit2 directory. By putting them in `util`, they'll be further exported by any other tool that uses the util library.
The test tree should - ideally - match the source tree. The libgit2 specific tests should move into tests/libgit2. The name of the resulting test binary is now 'libgit2_tests' for ease of consumption by new contributors.
Other subsystems (like libgit2client) likely want to add testing. Move clar into its own directory so that it can be reused and not duplicated.
Put the common clar test function into the tests/CMakeLists.txt for reusability.
Provide a mechanism to add a signal handler for Unix or Win32.
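One way a cross-platform interrupt handler could be structured is sketched below. The function names are hypothetical and not from the PR. For brevity the Unix branch uses ISO C `signal`, where a real implementation might well prefer `sigaction`; the Win32 branch uses `SetConsoleCtrlHandler`. In both cases the handler only sets a flag, which the application polls.

```c
#include <signal.h>
#ifdef _WIN32
# include <windows.h>
#endif

/* Set from the handler; polled by the application. */
static volatile sig_atomic_t interrupt_seen = 0;

#ifdef _WIN32

/* Console control handlers run on a separate thread on Windows. */
static BOOL WINAPI on_console_ctrl(DWORD type)
{
    if (type != CTRL_C_EVENT)
        return FALSE;

    interrupt_seen = 1;
    return TRUE;
}

/* Hypothetical name. Returns 0 on success, -1 on failure. */
int interrupt_handler_install(void)
{
    return SetConsoleCtrlHandler(on_console_ctrl, TRUE) ? 0 : -1;
}

#else

static void on_sigint(int signo)
{
    (void)signo;
    interrupt_seen = 1;
}

/* Hypothetical name. Returns 0 on success, -1 on failure. */
int interrupt_handler_install(void)
{
    return (signal(SIGINT, on_sigint) == SIG_ERR) ? -1 : 0;
}

#endif

/* Nonzero once Ctrl-C (or SIGINT) has been observed. */
int interrupt_requested(void)
{
    return interrupt_seen != 0;
}
```

Keeping the handler down to a single flag write avoids doing anything that is unsafe in signal context; long-running operations such as a fetch could then check `interrupt_requested()` between steps.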
Provide functions to search through string arrays.
Add a function to add a search/replaced string to a git_buf.
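A minimal sketch of the idea follows, using a small stand-in buffer type rather than `git_buf` itself, since the real function's name and signature are not shown here. `strbuf`, `strbuf_put` and `strbuf_put_replaced` are all hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a growable string buffer (not git_buf). */
typedef struct {
    char *ptr;     /* NUL-terminated contents, or NULL when empty */
    size_t size;   /* length of contents, excluding the NUL */
    size_t asize;  /* allocated bytes */
} strbuf;

/* Appends len bytes, growing the buffer as needed. Returns 0 or -1. */
static int strbuf_put(strbuf *b, const char *data, size_t len)
{
    if (b->size + len + 1 > b->asize) {
        size_t want = (b->size + len + 1) * 2;
        char *grown = realloc(b->ptr, want);

        if (!grown)
            return -1;

        b->ptr = grown;
        b->asize = want;
    }

    memcpy(b->ptr + b->size, data, len);
    b->size += len;
    b->ptr[b->size] = '\0';
    return 0;
}

/* Appends src to b, with every occurrence of find replaced by replace. */
int strbuf_put_replaced(
    strbuf *b, const char *src, const char *find, const char *replace)
{
    size_t find_len = strlen(find), replace_len = strlen(replace);
    const char *hit;

    /* An empty search string matches nothing; append src unchanged. */
    if (find_len == 0)
        return strbuf_put(b, src, strlen(src));

    while ((hit = strstr(src, find)) != NULL) {
        if (strbuf_put(b, src, (size_t)(hit - src)) < 0 ||
            strbuf_put(b, replace, replace_len) < 0)
            return -1;

        src = hit + find_len;
    }

    /* Append whatever follows the last match. */
    return strbuf_put(b, src, strlen(src));
}
```

A helper like this is handy for expanding placeholders in a configured command line, for example substituting a file name for `%f` before the command is run.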
Introduce a helper method to quote a string in a shell-safe manner. This wraps the entire buffer in single quotes, escaping single quotes and exclamation points.
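The quoting described above could look roughly like the sketch below. The function name is hypothetical, and the exact escape sequences are an assumption: the sketch follows the convention git itself uses, closing the quoted run, emitting a backslash-escaped character, and reopening the quotes (so `'` becomes `'\''` and `!` becomes `'\!'`).

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns a newly allocated, single-quoted copy of
 * `in`, or NULL on allocation failure. The caller frees the result.
 */
char *shell_quote(const char *in)
{
    size_t len = 3; /* opening quote, closing quote, NUL */
    const char *p;
    char *out, *o;

    /* Each escaped character expands to four bytes: '\c' */
    for (p = in; *p; p++)
        len += (*p == '\'' || *p == '!') ? 4 : 1;

    if ((out = malloc(len)) == NULL)
        return NULL;

    o = out;
    *o++ = '\'';

    for (p = in; *p; p++) {
        if (*p == '\'' || *p == '!') {
            *o++ = '\'';  /* close the quoted run */
            *o++ = '\\';  /* escape the character itself */
            *o++ = *p;
            *o++ = '\'';  /* reopen the quoted run */
        } else {
            *o++ = *p;
        }
    }

    *o++ = '\'';
    *o = '\0';
    return out;
}
```

Everything inside single quotes is literal to a POSIX shell, which is why only the quote character itself (and `!`, for shells with history expansion) needs special handling.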
Provide a mechanism for the client library to report a unique error class. This will not be used by libgit2 directly.


ethomson commented May 10, 2020
This pull request expands the scope of libgit2 from just being the (relatively) low-level library for dealing with repositories to add two new pieces of functionality:
libgit2client: a "middleware" library that is focused on providing the functionality that client applications would find useful: for example, a subtransport that uses the ssh command-line, and a filter implementation that runs the commands specified in the attributes.
Things like LFS and executing `ssh` have been well outside the scope of libgit2 since its implementation; however, we've been forcing every tool that wants to build a client on top of us to deal with these problems themselves. We should provide this functionality for them.
I propose putting this in a separate library since not everybody finds it valuable and - frankly - if I were running a git hosting provider, I would prefer not to have `fork` and `exec` in the codebase at all if I could avoid it.
git2_cli: a command-line interface that emulates git itself for testing. This allows us to test our client library, to eat our own dogfood, and - potentially - to start trying to use git's command-line tests against our own CLI to validate compatibility.
util: the shared bits of code (buffer manipulation, etc) that all the layers will want to use. This is not provided to end-users, just used by the two shared libraries and the CLI application.